Posted to user@cassandra.apache.org by "Hiller, Dean" <De...@nrel.gov> on 2013/02/20 19:49:59 UTC
very confused by jmap dump of cassandra
I took this jmap dump of Cassandra (in production). Before I restarted the whole production cluster, I had some nodes running compaction, and it looked like all memory had been consumed (as if Cassandra were not clearing out the caches or memtables fast enough). I am still trying to debug why compaction causes slowness on the cluster, since all cassandra.yaml files are pretty much at the defaults with size-tiered compaction.
The weird thing is that the dump produces a 5.4G heap.bin file, but when I load it into Eclipse it tells me the total is 142.8MB. How can it be so low? top was showing 1.9G at the time (and the top snapshot below was taken two hours later). How can the Eclipse profiler say the jmap dump shows 142.8MB in use instead of 1.9G?
Tasks: 398 total, 1 running, 397 sleeping, 0 stopped, 0 zombie
Cpu(s): 2.8%us, 0.5%sy, 0.0%ni, 96.5%id, 0.1%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 32854680k total, 31910708k used, 943972k free, 89776k buffers
Swap: 33554424k total, 18288k used, 33536136k free, 23428596k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
20909 cassandr 20 0 64.1g 9.2g 2.1g S 75.7 29.4 182:37.92 java
22455 cassandr 20 0 15288 1340 824 R 3.9 0.0 0:00.02 top
It almost seems like Cassandra is not being good about memory management here: we slowly get into a situation where compaction runs and takes out our memory (the heap is configured for 8G). I could easily go higher than 8G on these systems since each node has 32G of RAM, but the docs said 8G is better for GC. Has anyone else taken a jmap dump of Cassandra?
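For reference, a hypothetical reconstruction of how such a dump is taken (the pgrep lookup is an assumption; the original used the java PID 20909 from the top output above). Two details matter for the size mystery: jmap's -dump:live option forces a full GC and writes only reachable objects, and Eclipse MAT additionally discards unreachable objects on load by default, which is one plausible explanation for a 5.4G heap.bin showing up as only 142.8MB:

```shell
# Sketch only: find a running Cassandra JVM and dump its heap.
# The pgrep pattern is an assumption; the original post used PID 20909.
PID=$(pgrep -f CassandraDaemon || true)

if [ -n "$PID" ]; then
    # -dump:live triggers a full GC first and writes only reachable objects,
    # so the file can be far smaller than the RSS that top reports.
    jmap -dump:live,format=b,file=heap-live.bin "$PID"

    # Without :live, dead-but-not-yet-collected objects are included too;
    # Eclipse MAT then drops unreachable objects on load by default.
    jmap -dump:format=b,file=heap.bin "$PID"
fi
```

Comparing the sizes of the two files (or enabling MAT's "keep unreachable objects" option) would show how much of the 5.4G was garbage that had simply not been collected yet.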
Thanks,
Dean
Re: very confused by jmap dump of cassandra
Posted by aaron morton <aa...@thelastpickle.com>.
Cannot comment too much on the jmap but I can add my general "compaction is hurting" strategy.
Try any or all of the following to get to a stable setup, then increase until things go bang.
Set concurrent compactors to 2.
Reduce compaction throughput by half.
Reduce in_memory_compaction_limit.
If you see compactions using a lot of sstables in the logs, reduce max_compaction_threshold.
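Those knobs map onto configuration roughly like this (a sketch using 1.x-era key names and defaults, which are assumptions to verify against your own cassandra.yaml; max_compaction_threshold is a per-column-family attribute set through the CLI rather than a yaml key):

```yaml
# cassandra.yaml sketch: halve the compaction defaults, then tune upward.
concurrent_compactors: 2                # cap parallel compaction tasks
compaction_throughput_mb_per_sec: 8     # half the usual 16 MB/s default
in_memory_compaction_limit_in_mb: 32    # half the usual 64 MB default
```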
> I can easily go higher than 8G on these systems as I have 32gig each node, but there was docs that said 8G is better for GC.
More JVM memory is not the answer.
Cheers
-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com
Re: very confused by jmap dump of cassandra
Posted by "Hiller, Dean" <De...@nrel.gov>.
(Thanks Aaron, I will try those steps… why are those steps not on this page?
http://www.datastax.com/docs/1.0/operations/tuning#tuning-java-heap-size )
I don't have much disk used at this point (roughly 130G per node). Here are the numbers for the data mount and the commitlog mount (the commit log is on a separate disk).
[root@sdi-ci controlcenter]# clush -g datanodes df -h /opt/datastore/data
a1.bigde.nrel.gov: Filesystem  Size  Used  Avail  Use%  Mounted on
a1.bigde.nrel.gov: /dev/sda2   1.0T  129G  895G   13%   /opt/datastore/data
a5.bigde.nrel.gov: Filesystem  Size  Used  Avail  Use%  Mounted on
a5.bigde.nrel.gov: /dev/sda2   1.0T  130G  895G   13%   /opt/datastore/data
a3.bigde.nrel.gov: Filesystem  Size  Used  Avail  Use%  Mounted on
a3.bigde.nrel.gov: /dev/sda2   1.0T  130G  895G   13%   /opt/datastore/data
a2.bigde.nrel.gov: Filesystem  Size  Used  Avail  Use%  Mounted on
a2.bigde.nrel.gov: /dev/sda2   1.0T  130G  895G   13%   /opt/datastore/data
a4.bigde.nrel.gov: Filesystem  Size  Used  Avail  Use%  Mounted on
a4.bigde.nrel.gov: /dev/sda2   1.0T  130G  895G   13%   /opt/datastore/data
a6.bigde.nrel.gov: Filesystem  Size  Used  Avail  Use%  Mounted on
a6.bigde.nrel.gov: /dev/sda2   1.0T  129G  895G   13%   /opt/datastore/data
[root@sdi-ci controlcenter]# clush -g datanodes df -h /opt/datastore/commitlog
a1.bigde.nrel.gov: Filesystem  Size  Used  Avail  Use%  Mounted on
a1.bigde.nrel.gov: /dev/sdb1   11G   1.2G  9.2G   12%   /opt/datastore/commitlog
a5.bigde.nrel.gov: Filesystem  Size  Used  Avail  Use%  Mounted on
a5.bigde.nrel.gov: /dev/sdb1   11G   1.1G  10G    10%   /opt/datastore/commitlog
a2.bigde.nrel.gov: Filesystem  Size  Used  Avail  Use%  Mounted on
a2.bigde.nrel.gov: /dev/sdb1   11G   1.1G  10G    10%   /opt/datastore/commitlog
a3.bigde.nrel.gov: Filesystem  Size  Used  Avail  Use%  Mounted on
a3.bigde.nrel.gov: /dev/sdb1   11G   1.1G  10G    10%   /opt/datastore/commitlog
a4.bigde.nrel.gov: Filesystem  Size  Used  Avail  Use%  Mounted on
a4.bigde.nrel.gov: /dev/sdb1   11G   1.1G  10G    10%   /opt/datastore/commitlog
a6.bigde.nrel.gov: Filesystem  Size  Used  Avail  Use%  Mounted on
a6.bigde.nrel.gov: /dev/sdb1   11G   1.1G  10G    10%   /opt/datastore/commitlog
[root@sdi-ci controlcenter]#
On 2/21/13 10:33 AM, "Mohit Anchlia" <mo...@gmail.com> wrote:
>Roughly how much data do you have per node?
Re: very confused by jmap dump of cassandra
Posted by Mohit Anchlia <mo...@gmail.com>.
Roughly how much data do you have per node?
Sent from my iPhone