You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@cassandra.apache.org by Tamar Fraenkel <ta...@tok-media.com> on 2012/08/14 06:45:49 UTC
GCInspector info messages in cassandra log
Hi!
I have 3 nodes ring running on Amazon EC2.
About once a week I see in the logs compaction messages and around the same
time info messages about GC (see below) that I think means it is taking too
long and happening too often.
Does it mean I have to reduce my cache size?
Thanks,
Tamar
INFO [ScheduledTasks:1] 2012-08-13 12:50:57,593 GCInspector.java (line
122) GC for ParNew: 242 ms for 1 collections, 1541590352 used; max is
1937768448
INFO [ScheduledTasks:1] 2012-08-13 12:51:27,740 GCInspector.java (line
122) GC for ParNew: 291 ms for 1 collections, 1458227032 used; max is
1937768448
INFO [ScheduledTasks:1] 2012-08-13 12:51:29,741 GCInspector.java (line
122) GC for ParNew: 261 ms for 1 collections, 1228861368 used; max is
1937768448
INFO [ScheduledTasks:1] 2012-08-13 12:51:30,833 GCInspector.java (line
122) GC for ParNew: 319 ms for 1 collections, 1120131360 used; max is
1937768448
INFO [ScheduledTasks:1] 2012-08-13 12:51:32,863 GCInspector.java (line
122) GC for ParNew: 241 ms for 1 collections, 983144216 used; max is
1937768448
INFO [ScheduledTasks:1] 2012-08-13 12:51:33,864 GCInspector.java (line
122) GC for ParNew: 215 ms for 1 collections, 967702720 used; max is
1937768448
INFO [ScheduledTasks:1] 2012-08-13 12:51:34,964 GCInspector.java (line
122) GC for ParNew: 248 ms for 1 collections, 973803344 used; max is
1937768448
INFO [ScheduledTasks:1] 2012-08-13 12:51:41,211 GCInspector.java (line
122) GC for ParNew: 265 ms for 1 collections, 1071933560 used; max is
1937768448
INFO [ScheduledTasks:1] 2012-08-13 12:51:43,212 GCInspector.java (line
122) GC for ParNew: 326 ms for 1 collections, 1217367792 used; max is
1937768448
INFO [ScheduledTasks:1] 2012-08-13 12:51:44,212 GCInspector.java (line
122) GC for ParNew: 245 ms for 1 collections, 1203481536 used; max is
1937768448
INFO [ScheduledTasks:1] 2012-08-13 12:51:45,213 GCInspector.java (line
122) GC for ParNew: 209 ms for 1 collections, 1208819416 used; max is
1937768448
INFO [ScheduledTasks:1] 2012-08-13 12:51:46,237 GCInspector.java (line
122) GC for ParNew: 248 ms for 1 collections, 1338361648 used; max is
1937768448
*Tamar Fraenkel *
Senior Software Engineer, TOK Media
[image: Inline image 1]
tamar@tok-media.com
Tel: +972 2 6409736
Mob: +972 54 8356490
Fax: +972 2 5612956
Re: GCInspector info messages in cassandra log
Posted by Tamar Fraenkel <ta...@tok-media.com>.
Thank you very much!
*Tamar Fraenkel *
Senior Software Engineer, TOK Media
[image: Inline image 1]
tamar@tok-media.com
Tel: +972 2 6409736
Mob: +972 54 8356490
Fax: +972 2 5612956
On Thu, Aug 16, 2012 at 12:11 AM, aaron morton <aa...@thelastpickle.com>wrote:
> Is there anything to do before that? like drain or flush?
>
> For a clean shutdown I do
>
> nodetool -h localhost disablethrift
> nodetool -h localhost disablegossip && sleep 10
> nodetool -h localhost drain
> then kill
>
> Would you recommend that? If I do it, how often should I do a full
> snapshot, and how often should I backup the backup directory?
>
> Sounds like you could use Priam and be happier...
> http://techblog.netflix.com/2012/02/announcing-priam.html
>
> I just saw that there is an option global_snapshot, is it still supported?
>
> I cannot find it.
>
> Try Piram or the instructions here, which are pretty much what you have
> described http://www.datastax.com/docs/1.0/operations/backup_restore
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 15/08/2012, at 4:57 PM, Tamar Fraenkel <ta...@tok-media.com> wrote:
>
> Aaron,
> Thank you very much. I will do as you suggested.
>
> One last question regarding restart:
> I assume, I should do it node by node.
> Is there anything to do before that? like drain or flush?
>
> I am also considering enabling incremental backups on my cluster.
> Currently I take a daily full snapshot of the cluster, tar it and load it
> to S3 (size now is 3.1GB). Would you recommend that? If I do it, how often
> should I do a full snapshot, and how often should I backup the backup
> directory?
>
> Another snapshot related question, currently I snapshot on each node and
> use parallel-slurp to copy the snapshot to one node where I tar them. I
> just saw that there is an option global_snapshot, is it still supported?
> Does that mean that if I run it on one node the snapshot will contain data
> from all cluster? How does it work in restore? Is it better than my current
> backup system?
>
> *Tamar Fraenkel *
> Senior Software Engineer, TOK Media
>
> [image: Inline image 1]
>
> tamar@tok-media.com
> Tel: +972 2 6409736
> Mob: +972 54 8356490
> Fax: +972 2 5612956
>
>
>
>
>
> On Tue, Aug 14, 2012 at 11:51 PM, aaron morton <aa...@thelastpickle.com>wrote:
>
>>
>> 1. According to cfstats there are the some CF with high Comacted row
>> maximum sizes (1131752, 4866323 and 25109160). Others max sizes are <
>> 1000000. Are these considered to be problematic, what can I do to solve
>> that?
>> 2.
>>
>> They are only 1, 4 and 25 MB. Not too big.
>>
>> What should be the values of in_memory_compaction_limit_in_mb
>> and concurrent_compactors and how do I change them?
>>
>> Sounds like you dont have very big CF's, so changing the
>> in_memory_compaction_limit_in_mb may not make too much difference.
>>
>> Try changing concurrent_compactors to 2 in the yaml file. This change
>> will let you know if GC and compaction are related.
>>
>>
>> change yaml file and restart,
>>
>> yes
>>
>> What do I do about the long rows? What value is considered too big.
>>
>> They churn more memory during compaction. If you have a lot of rows +32
>> MB I would think about it, does not look that way.
>>
>> Cheers
>>
>>
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 15/08/2012, at 3:15 AM, Tamar Fraenkel <ta...@tok-media.com> wrote:
>>
>> Hi!
>> It helps, but before I do more actions I want to give you some more info,
>> and ask some questions:
>>
>> *Related Info*
>>
>> 1. According to my yaml file (where do I see these parameters in the
>> jmx? I couldn't find them):
>> in_memory_compaction_limit_in_mb: 64
>> concurrent_compactors: 1, but it is commented out, so I guess it is
>> the default value
>> multithreaded_compaction: false
>> compaction_throughput_mb_per_sec: 16
>> compaction_preheat_key_cache: true
>> 2. According to cfstats there are the some CF with high Comacted row
>> maximum sizes (1131752, 4866323 and 25109160). Others max sizes are <
>> 1000000. Are these considered to be problematic, what can I do to solve
>> that?
>> 3. During compactions Cassandra is slower
>> 4. Running Cassandra Version 1.0.8
>>
>> *Questions*
>> What should be the values of in_memory_compaction_limit_in_mb
>> and concurrent_compactors and how do I change them? change yaml file
>> and restart, or can it be done using jmx without restarting Cassandra?
>> What do I do about the long rows? What value is considered too big.
>>
>> I appreciate your help! Thanks,
>>
>>
>>
>> *Tamar Fraenkel *
>> Senior Software Engineer, TOK Media
>>
>> <tokLogo.png>
>>
>> tamar@tok-media.com
>> Tel: +972 2 6409736
>> Mob: +972 54 8356490
>> Fax: +972 2 5612956
>>
>>
>>
>>
>>
>> On Tue, Aug 14, 2012 at 1:22 PM, aaron morton <aa...@thelastpickle.com>wrote:
>>
>>> There are a couple of steps you can take if compaction is causing GC.
>>>
>>> - if you have a lot of wide rows consider reducing
>>> the in_memory_compaction_limit_in_mb yaml setting. This will slow down
>>> compaction but will reduce the memory usage.
>>>
>>> - reduce concurrent_compactors
>>>
>>> Both of these may slow down compaction. Once you have GC under control
>>> you may want to play with memory settings.
>>>
>>> Hope that helps.
>>> -----------------
>>> Aaron Morton
>>> Freelance Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 14/08/2012, at 4:45 PM, Tamar Fraenkel <ta...@tok-media.com> wrote:
>>>
>>> Hi!
>>> I have 3 nodes ring running on Amazon EC2.
>>> About once a week I see in the logs compaction messages and around the
>>> same time info messages about GC (see below) that I think means it is
>>> taking too long and happening too often.
>>>
>>> Does it mean I have to reduce my cache size?
>>> Thanks,
>>> Tamar
>>>
>>> INFO [ScheduledTasks:1] 2012-08-13 12:50:57,593 GCInspector.java (line
>>> 122) GC for ParNew: 242 ms for 1 collections, 1541590352 used; max is
>>> 1937768448
>>> INFO [ScheduledTasks:1] 2012-08-13 12:51:27,740 GCInspector.java (line
>>> 122) GC for ParNew: 291 ms for 1 collections, 1458227032 used; max is
>>> 1937768448
>>> INFO [ScheduledTasks:1] 2012-08-13 12:51:29,741 GCInspector.java (line
>>> 122) GC for ParNew: 261 ms for 1 collections, 1228861368 used; max is
>>> 1937768448
>>> INFO [ScheduledTasks:1] 2012-08-13 12:51:30,833 GCInspector.java (line
>>> 122) GC for ParNew: 319 ms for 1 collections, 1120131360 used; max is
>>> 1937768448
>>> INFO [ScheduledTasks:1] 2012-08-13 12:51:32,863 GCInspector.java (line
>>> 122) GC for ParNew: 241 ms for 1 collections, 983144216 used; max is
>>> 1937768448
>>> INFO [ScheduledTasks:1] 2012-08-13 12:51:33,864 GCInspector.java (line
>>> 122) GC for ParNew: 215 ms for 1 collections, 967702720 used; max is
>>> 1937768448
>>> INFO [ScheduledTasks:1] 2012-08-13 12:51:34,964 GCInspector.java (line
>>> 122) GC for ParNew: 248 ms for 1 collections, 973803344 used; max is
>>> 1937768448
>>> INFO [ScheduledTasks:1] 2012-08-13 12:51:41,211 GCInspector.java (line
>>> 122) GC for ParNew: 265 ms for 1 collections, 1071933560 used; max is
>>> 1937768448
>>> INFO [ScheduledTasks:1] 2012-08-13 12:51:43,212 GCInspector.java (line
>>> 122) GC for ParNew: 326 ms for 1 collections, 1217367792 used; max is
>>> 1937768448
>>> INFO [ScheduledTasks:1] 2012-08-13 12:51:44,212 GCInspector.java (line
>>> 122) GC for ParNew: 245 ms for 1 collections, 1203481536 used; max is
>>> 1937768448
>>> INFO [ScheduledTasks:1] 2012-08-13 12:51:45,213 GCInspector.java (line
>>> 122) GC for ParNew: 209 ms for 1 collections, 1208819416 used; max is
>>> 1937768448
>>> INFO [ScheduledTasks:1] 2012-08-13 12:51:46,237 GCInspector.java (line
>>> 122) GC for ParNew: 248 ms for 1 collections, 1338361648 used; max is
>>> 1937768448
>>>
>>>
>>> *Tamar Fraenkel *
>>> Senior Software Engineer, TOK Media
>>>
>>> <tokLogo.png>
>>>
>>>
>>> tamar@tok-media.com
>>> Tel: +972 2 6409736
>>> Mob: +972 54 8356490
>>> Fax: +972 2 5612956
>>>
>>>
>>>
>>>
>>>
>>
>>
>
>
Re: GCInspector info messages in cassandra log
Posted by aaron morton <aa...@thelastpickle.com>.
> Is there anything to do before that? like drain or flush?
For a clean shutdown I do
nodetool -h localhost disablethrift
nodetool -h localhost disablegossip && sleep 10
nodetool -h localhost drain
then kill
> Would you recommend that? If I do it, how often should I do a full snapshot, and how often should I backup the backup directory?
Sounds like you could use Priam and be happier...
http://techblog.netflix.com/2012/02/announcing-priam.html
> I just saw that there is an option global_snapshot, is it still supported?
I cannot find it.
Try Piram or the instructions here, which are pretty much what you have described http://www.datastax.com/docs/1.0/operations/backup_restore
Cheers
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 15/08/2012, at 4:57 PM, Tamar Fraenkel <ta...@tok-media.com> wrote:
> Aaron,
> Thank you very much. I will do as you suggested.
>
> One last question regarding restart:
> I assume, I should do it node by node.
> Is there anything to do before that? like drain or flush?
>
> I am also considering enabling incremental backups on my cluster. Currently I take a daily full snapshot of the cluster, tar it and load it to S3 (size now is 3.1GB). Would you recommend that? If I do it, how often should I do a full snapshot, and how often should I backup the backup directory?
>
> Another snapshot related question, currently I snapshot on each node and use parallel-slurp to copy the snapshot to one node where I tar them. I just saw that there is an option global_snapshot, is it still supported? Does that mean that if I run it on one node the snapshot will contain data from all cluster? How does it work in restore? Is it better than my current backup system?
>
> Tamar Fraenkel
> Senior Software Engineer, TOK Media
>
>
>
> tamar@tok-media.com
> Tel: +972 2 6409736
> Mob: +972 54 8356490
> Fax: +972 2 5612956
>
>
>
>
>
> On Tue, Aug 14, 2012 at 11:51 PM, aaron morton <aa...@thelastpickle.com> wrote:
>> According to cfstats there are the some CF with high Comacted row maximum sizes (1131752, 4866323 and 25109160). Others max sizes are < 1000000. Are these considered to be problematic, what can I do to solve that?
> They are only 1, 4 and 25 MB. Not too big.
>
>> What should be the values of in_memory_compaction_limit_in_mb and concurrent_compactors and how do I change them?
> Sounds like you dont have very big CF's, so changing the in_memory_compaction_limit_in_mb may not make too much difference.
>
> Try changing concurrent_compactors to 2 in the yaml file. This change will let you know if GC and compaction are related.
>
>> change yaml file and restart,
> yes
>
>> What do I do about the long rows? What value is considered too big.
> They churn more memory during compaction. If you have a lot of rows +32 MB I would think about it, does not look that way.
>
> Cheers
>
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 15/08/2012, at 3:15 AM, Tamar Fraenkel <ta...@tok-media.com> wrote:
>
>> Hi!
>> It helps, but before I do more actions I want to give you some more info, and ask some questions:
>>
>> Related Info
>> According to my yaml file (where do I see these parameters in the jmx? I couldn't find them):
>> in_memory_compaction_limit_in_mb: 64
>> concurrent_compactors: 1, but it is commented out, so I guess it is the default value
>> multithreaded_compaction: false
>> compaction_throughput_mb_per_sec: 16
>> compaction_preheat_key_cache: true
>> According to cfstats there are the some CF with high Comacted row maximum sizes (1131752, 4866323 and 25109160). Others max sizes are < 1000000. Are these considered to be problematic, what can I do to solve that?
>> During compactions Cassandra is slower
>> Running Cassandra Version 1.0.8
>> Questions
>> What should be the values of in_memory_compaction_limit_in_mb and concurrent_compactors and how do I change them? change yaml file and restart, or can it be done using jmx without restarting Cassandra?
>> What do I do about the long rows? What value is considered too big.
>>
>> I appreciate your help!
>> Thanks,
>>
>>
>>
>> Tamar Fraenkel
>> Senior Software Engineer, TOK Media
>>
>> <tokLogo.png>
>>
>> tamar@tok-media.com
>> Tel: +972 2 6409736
>> Mob: +972 54 8356490
>> Fax: +972 2 5612956
>>
>>
>>
>>
>>
>> On Tue, Aug 14, 2012 at 1:22 PM, aaron morton <aa...@thelastpickle.com> wrote:
>> There are a couple of steps you can take if compaction is causing GC.
>>
>> - if you have a lot of wide rows consider reducing the in_memory_compaction_limit_in_mb yaml setting. This will slow down compaction but will reduce the memory usage.
>>
>> - reduce concurrent_compactors
>>
>> Both of these may slow down compaction. Once you have GC under control you may want to play with memory settings.
>>
>> Hope that helps.
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 14/08/2012, at 4:45 PM, Tamar Fraenkel <ta...@tok-media.com> wrote:
>>
>>> Hi!
>>> I have 3 nodes ring running on Amazon EC2.
>>> About once a week I see in the logs compaction messages and around the same time info messages about GC (see below) that I think means it is taking too long and happening too often.
>>>
>>> Does it mean I have to reduce my cache size?
>>> Thanks,
>>> Tamar
>>>
>>> INFO [ScheduledTasks:1] 2012-08-13 12:50:57,593 GCInspector.java (line 122) GC for ParNew: 242 ms for 1 collections, 1541590352 used; max is 1937768448
>>> INFO [ScheduledTasks:1] 2012-08-13 12:51:27,740 GCInspector.java (line 122) GC for ParNew: 291 ms for 1 collections, 1458227032 used; max is 1937768448
>>> INFO [ScheduledTasks:1] 2012-08-13 12:51:29,741 GCInspector.java (line 122) GC for ParNew: 261 ms for 1 collections, 1228861368 used; max is 1937768448
>>> INFO [ScheduledTasks:1] 2012-08-13 12:51:30,833 GCInspector.java (line 122) GC for ParNew: 319 ms for 1 collections, 1120131360 used; max is 1937768448
>>> INFO [ScheduledTasks:1] 2012-08-13 12:51:32,863 GCInspector.java (line 122) GC for ParNew: 241 ms for 1 collections, 983144216 used; max is 1937768448
>>> INFO [ScheduledTasks:1] 2012-08-13 12:51:33,864 GCInspector.java (line 122) GC for ParNew: 215 ms for 1 collections, 967702720 used; max is 1937768448
>>> INFO [ScheduledTasks:1] 2012-08-13 12:51:34,964 GCInspector.java (line 122) GC for ParNew: 248 ms for 1 collections, 973803344 used; max is 1937768448
>>> INFO [ScheduledTasks:1] 2012-08-13 12:51:41,211 GCInspector.java (line 122) GC for ParNew: 265 ms for 1 collections, 1071933560 used; max is 1937768448
>>> INFO [ScheduledTasks:1] 2012-08-13 12:51:43,212 GCInspector.java (line 122) GC for ParNew: 326 ms for 1 collections, 1217367792 used; max is 1937768448
>>> INFO [ScheduledTasks:1] 2012-08-13 12:51:44,212 GCInspector.java (line 122) GC for ParNew: 245 ms for 1 collections, 1203481536 used; max is 1937768448
>>> INFO [ScheduledTasks:1] 2012-08-13 12:51:45,213 GCInspector.java (line 122) GC for ParNew: 209 ms for 1 collections, 1208819416 used; max is 1937768448
>>> INFO [ScheduledTasks:1] 2012-08-13 12:51:46,237 GCInspector.java (line 122) GC for ParNew: 248 ms for 1 collections, 1338361648 used; max is 1937768448
>>>
>>>
>>> Tamar Fraenkel
>>> Senior Software Engineer, TOK Media
>>>
>>> <tokLogo.png>
>>>
>>>
>>> tamar@tok-media.com
>>> Tel: +972 2 6409736
>>> Mob: +972 54 8356490
>>> Fax: +972 2 5612956
>>>
>>>
>>>
>>
>>
>
>
Re: GCInspector info messages in cassandra log
Posted by Tamar Fraenkel <ta...@tok-media.com>.
Aaron,
Thank you very much. I will do as you suggested.
One last question regarding restart:
I assume, I should do it node by node.
Is there anything to do before that? like drain or flush?
I am also considering enabling incremental backups on my cluster. Currently
I take a daily full snapshot of the cluster, tar it and load it to S3 (size
now is 3.1GB). Would you recommend that? If I do it, how often should I do
a full snapshot, and how often should I backup the backup directory?
Another snapshot related question, currently I snapshot on each node and
use parallel-slurp to copy the snapshot to one node where I tar them. I
just saw that there is an option global_snapshot, is it still supported?
Does that mean that if I run it on one node the snapshot will contain data
from all cluster? How does it work in restore? Is it better than my current
backup system?
*Tamar Fraenkel *
Senior Software Engineer, TOK Media
[image: Inline image 1]
tamar@tok-media.com
Tel: +972 2 6409736
Mob: +972 54 8356490
Fax: +972 2 5612956
On Tue, Aug 14, 2012 at 11:51 PM, aaron morton <aa...@thelastpickle.com>wrote:
>
> 1. According to cfstats there are the some CF with high Comacted row
> maximum sizes (1131752, 4866323 and 25109160). Others max sizes are <
> 1000000. Are these considered to be problematic, what can I do to solve
> that?
> 2.
>
> They are only 1, 4 and 25 MB. Not too big.
>
> What should be the values of in_memory_compaction_limit_in_mb
> and concurrent_compactors and how do I change them?
>
> Sounds like you dont have very big CF's, so changing the
> in_memory_compaction_limit_in_mb may not make too much difference.
>
> Try changing concurrent_compactors to 2 in the yaml file. This change
> will let you know if GC and compaction are related.
>
>
> change yaml file and restart,
>
> yes
>
> What do I do about the long rows? What value is considered too big.
>
> They churn more memory during compaction. If you have a lot of rows +32 MB
> I would think about it, does not look that way.
>
> Cheers
>
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 15/08/2012, at 3:15 AM, Tamar Fraenkel <ta...@tok-media.com> wrote:
>
> Hi!
> It helps, but before I do more actions I want to give you some more info,
> and ask some questions:
>
> *Related Info*
>
> 1. According to my yaml file (where do I see these parameters in the
> jmx? I couldn't find them):
> in_memory_compaction_limit_in_mb: 64
> concurrent_compactors: 1, but it is commented out, so I guess it is
> the default value
> multithreaded_compaction: false
> compaction_throughput_mb_per_sec: 16
> compaction_preheat_key_cache: true
> 2. According to cfstats there are the some CF with high Comacted row
> maximum sizes (1131752, 4866323 and 25109160). Others max sizes are <
> 1000000. Are these considered to be problematic, what can I do to solve
> that?
> 3. During compactions Cassandra is slower
> 4. Running Cassandra Version 1.0.8
>
> *Questions*
> What should be the values of in_memory_compaction_limit_in_mb
> and concurrent_compactors and how do I change them? change yaml file and
> restart, or can it be done using jmx without restarting Cassandra?
> What do I do about the long rows? What value is considered too big.
>
> I appreciate your help! Thanks,
>
>
>
> *Tamar Fraenkel *
> Senior Software Engineer, TOK Media
>
> <tokLogo.png>
>
> tamar@tok-media.com
> Tel: +972 2 6409736
> Mob: +972 54 8356490
> Fax: +972 2 5612956
>
>
>
>
>
> On Tue, Aug 14, 2012 at 1:22 PM, aaron morton <aa...@thelastpickle.com>wrote:
>
>> There are a couple of steps you can take if compaction is causing GC.
>>
>> - if you have a lot of wide rows consider reducing
>> the in_memory_compaction_limit_in_mb yaml setting. This will slow down
>> compaction but will reduce the memory usage.
>>
>> - reduce concurrent_compactors
>>
>> Both of these may slow down compaction. Once you have GC under control
>> you may want to play with memory settings.
>>
>> Hope that helps.
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 14/08/2012, at 4:45 PM, Tamar Fraenkel <ta...@tok-media.com> wrote:
>>
>> Hi!
>> I have 3 nodes ring running on Amazon EC2.
>> About once a week I see in the logs compaction messages and around the
>> same time info messages about GC (see below) that I think means it is
>> taking too long and happening too often.
>>
>> Does it mean I have to reduce my cache size?
>> Thanks,
>> Tamar
>>
>> INFO [ScheduledTasks:1] 2012-08-13 12:50:57,593 GCInspector.java (line
>> 122) GC for ParNew: 242 ms for 1 collections, 1541590352 used; max is
>> 1937768448
>> INFO [ScheduledTasks:1] 2012-08-13 12:51:27,740 GCInspector.java (line
>> 122) GC for ParNew: 291 ms for 1 collections, 1458227032 used; max is
>> 1937768448
>> INFO [ScheduledTasks:1] 2012-08-13 12:51:29,741 GCInspector.java (line
>> 122) GC for ParNew: 261 ms for 1 collections, 1228861368 used; max is
>> 1937768448
>> INFO [ScheduledTasks:1] 2012-08-13 12:51:30,833 GCInspector.java (line
>> 122) GC for ParNew: 319 ms for 1 collections, 1120131360 used; max is
>> 1937768448
>> INFO [ScheduledTasks:1] 2012-08-13 12:51:32,863 GCInspector.java (line
>> 122) GC for ParNew: 241 ms for 1 collections, 983144216 used; max is
>> 1937768448
>> INFO [ScheduledTasks:1] 2012-08-13 12:51:33,864 GCInspector.java (line
>> 122) GC for ParNew: 215 ms for 1 collections, 967702720 used; max is
>> 1937768448
>> INFO [ScheduledTasks:1] 2012-08-13 12:51:34,964 GCInspector.java (line
>> 122) GC for ParNew: 248 ms for 1 collections, 973803344 used; max is
>> 1937768448
>> INFO [ScheduledTasks:1] 2012-08-13 12:51:41,211 GCInspector.java (line
>> 122) GC for ParNew: 265 ms for 1 collections, 1071933560 used; max is
>> 1937768448
>> INFO [ScheduledTasks:1] 2012-08-13 12:51:43,212 GCInspector.java (line
>> 122) GC for ParNew: 326 ms for 1 collections, 1217367792 used; max is
>> 1937768448
>> INFO [ScheduledTasks:1] 2012-08-13 12:51:44,212 GCInspector.java (line
>> 122) GC for ParNew: 245 ms for 1 collections, 1203481536 used; max is
>> 1937768448
>> INFO [ScheduledTasks:1] 2012-08-13 12:51:45,213 GCInspector.java (line
>> 122) GC for ParNew: 209 ms for 1 collections, 1208819416 used; max is
>> 1937768448
>> INFO [ScheduledTasks:1] 2012-08-13 12:51:46,237 GCInspector.java (line
>> 122) GC for ParNew: 248 ms for 1 collections, 1338361648 used; max is
>> 1937768448
>>
>>
>> *Tamar Fraenkel *
>> Senior Software Engineer, TOK Media
>>
>> <tokLogo.png>
>>
>>
>> tamar@tok-media.com
>> Tel: +972 2 6409736
>> Mob: +972 54 8356490
>> Fax: +972 2 5612956
>>
>>
>>
>>
>>
>
>
Re: GCInspector info messages in cassandra log
Posted by aaron morton <aa...@thelastpickle.com>.
> According to cfstats there are the some CF with high Comacted row maximum sizes (1131752, 4866323 and 25109160). Others max sizes are < 1000000. Are these considered to be problematic, what can I do to solve that?
They are only 1, 4 and 25 MB. Not too big.
> What should be the values of in_memory_compaction_limit_in_mb and concurrent_compactors and how do I change them?
Sounds like you dont have very big CF's, so changing the in_memory_compaction_limit_in_mb may not make too much difference.
Try changing concurrent_compactors to 2 in the yaml file. This change will let you know if GC and compaction are related.
> change yaml file and restart,
yes
> What do I do about the long rows? What value is considered too big.
They churn more memory during compaction. If you have a lot of rows +32 MB I would think about it, does not look that way.
Cheers
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 15/08/2012, at 3:15 AM, Tamar Fraenkel <ta...@tok-media.com> wrote:
> Hi!
> It helps, but before I do more actions I want to give you some more info, and ask some questions:
>
> Related Info
> According to my yaml file (where do I see these parameters in the jmx? I couldn't find them):
> in_memory_compaction_limit_in_mb: 64
> concurrent_compactors: 1, but it is commented out, so I guess it is the default value
> multithreaded_compaction: false
> compaction_throughput_mb_per_sec: 16
> compaction_preheat_key_cache: true
> According to cfstats there are the some CF with high Comacted row maximum sizes (1131752, 4866323 and 25109160). Others max sizes are < 1000000. Are these considered to be problematic, what can I do to solve that?
> During compactions Cassandra is slower
> Running Cassandra Version 1.0.8
> Questions
> What should be the values of in_memory_compaction_limit_in_mb and concurrent_compactors and how do I change them? change yaml file and restart, or can it be done using jmx without restarting Cassandra?
> What do I do about the long rows? What value is considered too big.
>
> I appreciate your help!
> Thanks,
>
>
>
> Tamar Fraenkel
> Senior Software Engineer, TOK Media
>
> <tokLogo.png>
>
> tamar@tok-media.com
> Tel: +972 2 6409736
> Mob: +972 54 8356490
> Fax: +972 2 5612956
>
>
>
>
>
> On Tue, Aug 14, 2012 at 1:22 PM, aaron morton <aa...@thelastpickle.com> wrote:
> There are a couple of steps you can take if compaction is causing GC.
>
> - if you have a lot of wide rows consider reducing the in_memory_compaction_limit_in_mb yaml setting. This will slow down compaction but will reduce the memory usage.
>
> - reduce concurrent_compactors
>
> Both of these may slow down compaction. Once you have GC under control you may want to play with memory settings.
>
> Hope that helps.
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 14/08/2012, at 4:45 PM, Tamar Fraenkel <ta...@tok-media.com> wrote:
>
>> Hi!
>> I have 3 nodes ring running on Amazon EC2.
>> About once a week I see in the logs compaction messages and around the same time info messages about GC (see below) that I think means it is taking too long and happening too often.
>>
>> Does it mean I have to reduce my cache size?
>> Thanks,
>> Tamar
>>
>> INFO [ScheduledTasks:1] 2012-08-13 12:50:57,593 GCInspector.java (line 122) GC for ParNew: 242 ms for 1 collections, 1541590352 used; max is 1937768448
>> INFO [ScheduledTasks:1] 2012-08-13 12:51:27,740 GCInspector.java (line 122) GC for ParNew: 291 ms for 1 collections, 1458227032 used; max is 1937768448
>> INFO [ScheduledTasks:1] 2012-08-13 12:51:29,741 GCInspector.java (line 122) GC for ParNew: 261 ms for 1 collections, 1228861368 used; max is 1937768448
>> INFO [ScheduledTasks:1] 2012-08-13 12:51:30,833 GCInspector.java (line 122) GC for ParNew: 319 ms for 1 collections, 1120131360 used; max is 1937768448
>> INFO [ScheduledTasks:1] 2012-08-13 12:51:32,863 GCInspector.java (line 122) GC for ParNew: 241 ms for 1 collections, 983144216 used; max is 1937768448
>> INFO [ScheduledTasks:1] 2012-08-13 12:51:33,864 GCInspector.java (line 122) GC for ParNew: 215 ms for 1 collections, 967702720 used; max is 1937768448
>> INFO [ScheduledTasks:1] 2012-08-13 12:51:34,964 GCInspector.java (line 122) GC for ParNew: 248 ms for 1 collections, 973803344 used; max is 1937768448
>> INFO [ScheduledTasks:1] 2012-08-13 12:51:41,211 GCInspector.java (line 122) GC for ParNew: 265 ms for 1 collections, 1071933560 used; max is 1937768448
>> INFO [ScheduledTasks:1] 2012-08-13 12:51:43,212 GCInspector.java (line 122) GC for ParNew: 326 ms for 1 collections, 1217367792 used; max is 1937768448
>> INFO [ScheduledTasks:1] 2012-08-13 12:51:44,212 GCInspector.java (line 122) GC for ParNew: 245 ms for 1 collections, 1203481536 used; max is 1937768448
>> INFO [ScheduledTasks:1] 2012-08-13 12:51:45,213 GCInspector.java (line 122) GC for ParNew: 209 ms for 1 collections, 1208819416 used; max is 1937768448
>> INFO [ScheduledTasks:1] 2012-08-13 12:51:46,237 GCInspector.java (line 122) GC for ParNew: 248 ms for 1 collections, 1338361648 used; max is 1937768448
>>
>>
>> Tamar Fraenkel
>> Senior Software Engineer, TOK Media
>>
>> <tokLogo.png>
>>
>>
>> tamar@tok-media.com
>> Tel: +972 2 6409736
>> Mob: +972 54 8356490
>> Fax: +972 2 5612956
>>
>>
>>
>
>
Re: GCInspector info messages in cassandra log
Posted by Tamar Fraenkel <ta...@tok-media.com>.
Hi!
It helps, but before I do more actions I want to give you some more info,
and ask some questions:
*Related Info*
1. According to my yaml file (where do I see these parameters in the jmx?
I couldn't find them):
in_memory_compaction_limit_in_mb: 64
concurrent_compactors: 1, but it is commented out, so I guess it is the
default value
multithreaded_compaction: false
compaction_throughput_mb_per_sec: 16
compaction_preheat_key_cache: true
2. According to cfstats there are the some CF with high Comacted row
maximum sizes (1131752, 4866323 and 25109160). Others max sizes are <
1000000. Are these considered to be problematic, what can I do to solve
that?
3. During compactions Cassandra is slower
4. Running Cassandra Version 1.0.8
*Questions*
What should be the values of in_memory_compaction_limit_in_mb
and concurrent_compactors and how do I change them? change yaml file and
restart, or can it be done using jmx without restarting Cassandra?
What do I do about the long rows? What value is considered too big.
I appreciate your help! Thanks,
*Tamar Fraenkel *
Senior Software Engineer, TOK Media
[image: Inline image 1]
tamar@tok-media.com
Tel: +972 2 6409736
Mob: +972 54 8356490
Fax: +972 2 5612956
On Tue, Aug 14, 2012 at 1:22 PM, aaron morton <aa...@thelastpickle.com>wrote:
> There are a couple of steps you can take if compaction is causing GC.
>
> - if you have a lot of wide rows consider reducing
> the in_memory_compaction_limit_in_mb yaml setting. This will slow down
> compaction but will reduce the memory usage.
>
> - reduce concurrent_compactors
>
> Both of these may slow down compaction. Once you have GC under control
> you may want to play with memory settings.
>
> Hope that helps.
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 14/08/2012, at 4:45 PM, Tamar Fraenkel <ta...@tok-media.com> wrote:
>
> Hi!
> I have 3 nodes ring running on Amazon EC2.
> About once a week I see in the logs compaction messages and around the
> same time info messages about GC (see below) that I think means it is
> taking too long and happening too often.
>
> Does it mean I have to reduce my cache size?
> Thanks,
> Tamar
>
> INFO [ScheduledTasks:1] 2012-08-13 12:50:57,593 GCInspector.java (line
> 122) GC for ParNew: 242 ms for 1 collections, 1541590352 used; max is
> 1937768448
> INFO [ScheduledTasks:1] 2012-08-13 12:51:27,740 GCInspector.java (line
> 122) GC for ParNew: 291 ms for 1 collections, 1458227032 used; max is
> 1937768448
> INFO [ScheduledTasks:1] 2012-08-13 12:51:29,741 GCInspector.java (line
> 122) GC for ParNew: 261 ms for 1 collections, 1228861368 used; max is
> 1937768448
> INFO [ScheduledTasks:1] 2012-08-13 12:51:30,833 GCInspector.java (line
> 122) GC for ParNew: 319 ms for 1 collections, 1120131360 used; max is
> 1937768448
> INFO [ScheduledTasks:1] 2012-08-13 12:51:32,863 GCInspector.java (line
> 122) GC for ParNew: 241 ms for 1 collections, 983144216 used; max is
> 1937768448
> INFO [ScheduledTasks:1] 2012-08-13 12:51:33,864 GCInspector.java (line
> 122) GC for ParNew: 215 ms for 1 collections, 967702720 used; max is
> 1937768448
> INFO [ScheduledTasks:1] 2012-08-13 12:51:34,964 GCInspector.java (line
> 122) GC for ParNew: 248 ms for 1 collections, 973803344 used; max is
> 1937768448
> INFO [ScheduledTasks:1] 2012-08-13 12:51:41,211 GCInspector.java (line
> 122) GC for ParNew: 265 ms for 1 collections, 1071933560 used; max is
> 1937768448
> INFO [ScheduledTasks:1] 2012-08-13 12:51:43,212 GCInspector.java (line
> 122) GC for ParNew: 326 ms for 1 collections, 1217367792 used; max is
> 1937768448
> INFO [ScheduledTasks:1] 2012-08-13 12:51:44,212 GCInspector.java (line
> 122) GC for ParNew: 245 ms for 1 collections, 1203481536 used; max is
> 1937768448
> INFO [ScheduledTasks:1] 2012-08-13 12:51:45,213 GCInspector.java (line
> 122) GC for ParNew: 209 ms for 1 collections, 1208819416 used; max is
> 1937768448
> INFO [ScheduledTasks:1] 2012-08-13 12:51:46,237 GCInspector.java (line
> 122) GC for ParNew: 248 ms for 1 collections, 1338361648 used; max is
> 1937768448
>
>
> *Tamar Fraenkel *
> Senior Software Engineer, TOK Media
>
> <tokLogo.png>
>
>
> tamar@tok-media.com
> Tel: +972 2 6409736
> Mob: +972 54 8356490
> Fax: +972 2 5612956
>
>
>
>
>
Re: GCInspector info messages in cassandra log
Posted by aaron morton <aa...@thelastpickle.com>.
There are a couple of steps you can take if compaction is causing GC.
- if you have a lot of wide rows consider reducing the in_memory_compaction_limit_in_mb yaml setting. This will slow down compaction but will reduce the memory usage.
- reduce concurrent_compactors
Both of these may slow down compaction. Once you have GC under control you may want to play with memory settings.
Hope that helps.
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 14/08/2012, at 4:45 PM, Tamar Fraenkel <ta...@tok-media.com> wrote:
> Hi!
> I have 3 nodes ring running on Amazon EC2.
> About once a week I see in the logs compaction messages and around the same time info messages about GC (see below) that I think means it is taking too long and happening too often.
>
> Does it mean I have to reduce my cache size?
> Thanks,
> Tamar
>
> INFO [ScheduledTasks:1] 2012-08-13 12:50:57,593 GCInspector.java (line 122) GC for ParNew: 242 ms for 1 collections, 1541590352 used; max is 1937768448
> INFO [ScheduledTasks:1] 2012-08-13 12:51:27,740 GCInspector.java (line 122) GC for ParNew: 291 ms for 1 collections, 1458227032 used; max is 1937768448
> INFO [ScheduledTasks:1] 2012-08-13 12:51:29,741 GCInspector.java (line 122) GC for ParNew: 261 ms for 1 collections, 1228861368 used; max is 1937768448
> INFO [ScheduledTasks:1] 2012-08-13 12:51:30,833 GCInspector.java (line 122) GC for ParNew: 319 ms for 1 collections, 1120131360 used; max is 1937768448
> INFO [ScheduledTasks:1] 2012-08-13 12:51:32,863 GCInspector.java (line 122) GC for ParNew: 241 ms for 1 collections, 983144216 used; max is 1937768448
> INFO [ScheduledTasks:1] 2012-08-13 12:51:33,864 GCInspector.java (line 122) GC for ParNew: 215 ms for 1 collections, 967702720 used; max is 1937768448
> INFO [ScheduledTasks:1] 2012-08-13 12:51:34,964 GCInspector.java (line 122) GC for ParNew: 248 ms for 1 collections, 973803344 used; max is 1937768448
> INFO [ScheduledTasks:1] 2012-08-13 12:51:41,211 GCInspector.java (line 122) GC for ParNew: 265 ms for 1 collections, 1071933560 used; max is 1937768448
> INFO [ScheduledTasks:1] 2012-08-13 12:51:43,212 GCInspector.java (line 122) GC for ParNew: 326 ms for 1 collections, 1217367792 used; max is 1937768448
> INFO [ScheduledTasks:1] 2012-08-13 12:51:44,212 GCInspector.java (line 122) GC for ParNew: 245 ms for 1 collections, 1203481536 used; max is 1937768448
> INFO [ScheduledTasks:1] 2012-08-13 12:51:45,213 GCInspector.java (line 122) GC for ParNew: 209 ms for 1 collections, 1208819416 used; max is 1937768448
> INFO [ScheduledTasks:1] 2012-08-13 12:51:46,237 GCInspector.java (line 122) GC for ParNew: 248 ms for 1 collections, 1338361648 used; max is 1937768448
>
>
> Tamar Fraenkel
> Senior Software Engineer, TOK Media
>
> <tokLogo.png>
>
> tamar@tok-media.com
> Tel: +972 2 6409736
> Mob: +972 54 8356490
> Fax: +972 2 5612956
>
>
>