Posted to user@cassandra.apache.org by Adam Plumb <ap...@fiksu.com> on 2016/03/11 17:14:38 UTC

Cassandra causing OOM Killer to strike on new cluster running 3.4

I've got a new cluster of 18 nodes running Cassandra 3.4 that I just
launched and loaded data into yesterday (roughly 2TB of total storage) and
am seeing runaway memory usage.  These nodes are EC2 c3.4xlarges with 30GB
RAM and the heap size is set to 8G with a new heap size of 1.6G.
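
(The -Xms7536M/-Xmx7536M/-Xmn1600M visible in the ps output below is what the
stock conf/cassandra-env.sh auto-calculates on a box this size, roughly 1/4 of
RAM for the heap and 100MB per core for the new generation.  Pinning the sizes
explicitly would look roughly like the following sketch, assuming the stock env
script rather than jvm.options:)

    # conf/cassandra-env.sh -- override the auto-calculated sizes
    MAX_HEAP_SIZE="8G"
    HEAP_NEWSIZE="1600M"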

Last night I finished loading up the data, then ran an incremental repair
on one of the nodes just to ensure that everything was working (nodetool
repair).  Overnight, all 18 nodes ran out of memory and were killed by the
OOM killer.  I restarted them this morning and they all came up fine, but
they just started churning through memory and got killed again.  I restarted
them again and they're doing the same thing.  I'm not getting any errors in
the system log, since the process is getting killed abruptly (which makes me
think this is a native memory issue, not a heap issue).
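
(For reference, a bare "nodetool repair" is an incremental repair by default in
this release; the invocations being discussed, assuming the abc keyspace from
the schema later in the thread, would look roughly like:)

    # incremental repair (the default), limited to one keyspace
    nodetool repair abc
    # full repair of the same keyspace, for comparison
    nodetool repair -full abc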

Obviously this behavior isn't the best.  I'm willing to provide any data
people need to help debug this; these nodes are still up and running.  I'm
also in IRC if anyone wants to jump on there.

Here is the output of ps aux:

497       64351  108 89.5 187156072 27642988 ?  SLl  15:13  62:15 java -ea
> -XX:+CMSClassUnloadingEnabled -XX:+UseThreadPriorities
> -XX:ThreadPriorityPolicy=42 -Xms7536M -Xmx7536M -Xmn1600M
> -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=1000003
> -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
> -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1
> -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
> -XX:+UseTLAB -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=45
> -XX:-ParallelRefProcEnabled -XX:-AlwaysPreTouch -XX:+UseBiasedLocking
> -XX:+UseTLAB -XX:+ResizeTLAB -Djava.net.preferIPv4Stack=true
> -Dcom.sun.management.jmxremote.port=7199
> -Dcom.sun.management.jmxremote.rmi.port=7199
> -Dcom.sun.management.jmxremote.ssl=false
> -Dcom.sun.management.jmxremote.authenticate=false
> -XX:+CMSClassUnloadingEnabled -Dlogback.configurationFile=logback.xml
> -Dcassandra.logdir=/usr/local/cassandra/logs
> -Dcassandra.storagedir=/usr/local/cassandra/data
> -Dcassandra-pidfile=/var/run/cassandra/cassandra.pid
> -cp /usr/local/cassandra/conf:
> /usr/local/cassandra/build/classes/main:
> /usr/local/cassandra/build/classes/thrift:
> /usr/local/cassandra/lib/airline-0.6.jar:
> /usr/local/cassandra/lib/antlr-runtime-3.5.2.jar:
> /usr/local/cassandra/lib/apache-cassandra-3.4.jar:
> /usr/local/cassandra/lib/apache-cassandra-clientutil-3.4.jar:
> /usr/local/cassandra/lib/apache-cassandra-thrift-3.4.jar:
> /usr/local/cassandra/lib/asm-5.0.4.jar:
> /usr/local/cassandra/lib/cassandra-driver-core-3.0.0-shaded.jar:
> /usr/local/cassandra/lib/commons-cli-1.1.jar:
> /usr/local/cassandra/lib/commons-codec-1.2.jar:
> /usr/local/cassandra/lib/commons-lang3-3.1.jar:
> /usr/local/cassandra/lib/commons-math3-3.2.jar:
> /usr/local/cassandra/lib/compress-lzf-0.8.4.jar:
> /usr/local/cassandra/lib/concurrentlinkedhashmap-lru-1.4.jar:
> /usr/local/cassandra/lib/concurrent-trees-2.4.0.jar:
> /usr/local/cassandra/lib/disruptor-3.0.1.jar:
> /usr/local/cassandra/lib/ecj-4.4.2.jar:
> /usr/local/cassandra/lib/guava-18.0.jar:
> /usr/local/cassandra/lib/high-scale-lib-1.0.6.jar:
> /usr/local/cassandra/lib/hppc-0.5.4.jar:
> /usr/local/cassandra/lib/jackson-core-asl-1.9.2.jar:
> /usr/local/cassandra/lib/jackson-mapper-asl-1.9.2.jar:
> /usr/local/cassandra/lib/jamm-0.3.0.jar:
> /usr/local/cassandra/lib/javax.inject.jar:
> /usr/local/cassandra/lib/jbcrypt-0.3m.jar:
> /usr/local/cassandra/lib/jcl-over-slf4j-1.7.7.jar:
> /usr/local/cassandra/lib/jflex-1.6.0.jar:
> /usr/local/cassandra/lib/jna-4.0.0.jar:
> /usr/local/cassandra/lib/joda-time-2.4.jar:
> /usr/local/cassandra/lib/json-simple-1.1.jar:
> /usr/local/cassandra/lib/libthrift-0.9.2.jar:
> /usr/local/cassandra/lib/log4j-over-slf4j-1.7.7.jar:
> /usr/local/cassandra/lib/logback-classic-1.1.3.jar:
> /usr/local/cassandra/lib/logback-core-1.1.3.jar:
> /usr/local/cassandra/lib/lz4-1.3.0.jar:
> /usr/local/cassandra/lib/metrics-core-3.1.0.jar:
> /usr/local/cassandra/lib/metrics-logback-3.1.0.jar:
> /usr/local/cassandra/lib/netty-all-4.0.23.Final.jar:
> /usr/local/cassandra/lib/ohc-core-0.4.2.jar:
> /usr/local/cassandra/lib/ohc-core-j8-0.4.2.jar:
> /usr/local/cassandra/lib/primitive-1.0.jar:
> /usr/local/cassandra/lib/reporter-config3-3.0.0.jar:
> /usr/local/cassandra/lib/reporter-config-base-3.0.0.jar:
> /usr/local/cassandra/lib/sigar-1.6.4.jar:
> /usr/local/cassandra/lib/slf4j-api-1.7.7.jar:
> /usr/local/cassandra/lib/snakeyaml-1.11.jar:
> /usr/local/cassandra/lib/snappy-java-1.1.1.7.jar:
> /usr/local/cassandra/lib/snowball-stemmer-1.3.0.581.1.jar:
> /usr/local/cassandra/lib/ST4-4.0.8.jar:
> /usr/local/cassandra/lib/stream-2.5.2.jar:
> /usr/local/cassandra/lib/thrift-server-0.3.7.jar:
> /usr/local/cassandra/lib/jsr223/*/*.jar
> org.apache.cassandra.service.CassandraDaemon


Here is some dmesg output:

[40003.010117] java invoked oom-killer: gfp_mask=0x280da, order=0,
oom_score_adj=0
[40003.013042] java cpuset=/ mems_allowed=0
[40003.014789] CPU: 3 PID: 37757 Comm: java Tainted: G            E
4.1.7-15.23.amzn1.x86_64 #1
[40003.017852] Hardware name: Xen HVM domU, BIOS 4.2.amazon 12/07/2015
[40003.020066]  0000000000000000 ffff8800ebaaba18 ffffffff814da12c
0000000000000000
[40003.022870]  ffff880763594c80 ffff8800ebaabac8 ffffffff814d7939
ffff8800ebaaba78
[40003.025674]  ffffffff811bf8f7 ffff880770679c00 ffff88077001c190
0000000000000080
[40003.028660] Call Trace:
[40003.029613]  [<ffffffff814da12c>] dump_stack+0x45/0x57
[40003.031486]  [<ffffffff814d7939>] dump_header+0x7f/0x1fe
[40003.033390]  [<ffffffff811bf8f7>] ? mem_cgroup_iter+0x137/0x3d0
[40003.035475]  [<ffffffff8107f496>] ? __queue_work+0x136/0x320
[40003.037594]  [<ffffffff8115d11c>] oom_kill_process+0x1cc/0x3b0
[40003.039825]  [<ffffffff8115d67e>] __out_of_memory+0x31e/0x530
[40003.041938]  [<ffffffff8115da2b>] out_of_memory+0x5b/0x80
[40003.043857]  [<ffffffff81162a79>] __alloc_pages_nodemask+0x8a9/0x8d0
[40003.046105]  [<ffffffff811a48fa>] alloc_page_interleave+0x3a/0x90
[40003.048419]  [<ffffffff811a79c3>] alloc_pages_vma+0x143/0x200
[40003.050582]  [<ffffffff81188035>] handle_mm_fault+0x1355/0x1770
[40003.052674]  [<ffffffff8118e4c5>] ? do_mmap_pgoff+0x2f5/0x3c0
[40003.054737]  [<ffffffff8105dafc>] __do_page_fault+0x17c/0x420
[40003.056858]  [<ffffffff8118c976>] ? SyS_mmap_pgoff+0x116/0x270
[40003.059082]  [<ffffffff8105ddc2>] do_page_fault+0x22/0x30
[40003.061084]  [<ffffffff814e2ad8>] page_fault+0x28/0x30
[40003.062938] Mem-Info:
[40003.063762] active_anon:5437903 inactive_anon:1025 isolated_anon:0
 active_file:51 inactive_file:8 isolated_file:0
 unevictable:2088582 dirty:0 writeback:0 unstable:0
 slab_reclaimable:82028 slab_unreclaimable:12209
 mapped:31065 shmem:20 pagetables:37089 bounce:0
 free:35830 free_pcp:3141 free_cma:0
[40003.075549] Node 0 DMA free:15872kB min:8kB low:8kB high:12kB
active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB
unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15988kB
managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB
slab_reclaimable:32kB slab_unreclaimable:0kB kernel_stack:0kB
pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB
free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[40003.090267] lowmem_reserve[]: 0 3746 30128 30128
[40003.092182] Node 0 DMA32 free:108236kB min:2756kB low:3444kB high:4132kB
active_anon:2400616kB inactive_anon:4060kB active_file:0kB
inactive_file:0kB unevictable:1049732kB isolated(anon):0kB
isolated(file):0kB present:3915776kB managed:3840296kB mlocked:1049732kB
dirty:4kB writeback:0kB mapped:16564kB shmem:12kB slab_reclaimable:243852kB
slab_unreclaimable:8832kB kernel_stack:1152kB pagetables:16532kB
unstable:0kB bounce:0kB free_pcp:5716kB local_pcp:220kB free_cma:0kB
writeback_tmp:0kB pages_scanned:5408 all_unreclaimable? yes
[40003.108802] lowmem_reserve[]: 0 0 26382 26382
[40003.110578] Node 0 Normal free:19212kB min:19412kB low:24264kB
high:29116kB active_anon:19350996kB inactive_anon:40kB active_file:212kB
inactive_file:80kB unevictable:7304596kB isolated(anon):0kB
isolated(file):0kB present:27525120kB managed:27015196kB mlocked:7304596kB
dirty:0kB writeback:0kB mapped:107696kB shmem:68kB slab_reclaimable:84228kB
slab_unreclaimable:40004kB kernel_stack:10000kB pagetables:131824kB
unstable:0kB bounce:0kB free_pcp:6848kB local_pcp:692kB free_cma:0kB
writeback_tmp:0kB pages_scanned:38332 all_unreclaimable? yes
[40003.128300] lowmem_reserve[]: 0 0 0 0
[40003.129844] Node 0 DMA: 0*4kB 0*8kB 0*16kB 2*32kB (UE) 3*64kB (UE)
2*128kB (UE) 2*256kB (UE) 1*512kB (E) 2*1024kB (UE) 2*2048kB (ER) 2*4096kB
(M) = 15872kB
[40003.135917] Node 0 DMA32: 193*4kB (UEM) 254*8kB (UEM) 714*16kB (UE)
1344*32kB (UEMR) 249*64kB (UEMR) 120*128kB (UER) 53*256kB (ER) 10*512kB
(ER) 1*1024kB (E) 0*2048kB 0*4096kB = 108244kB
[40003.142956] Node 0 Normal: 3956*4kB (UE) 0*8kB 1*16kB (R) 8*32kB (R)
3*64kB (R) 2*128kB (R) 3*256kB (R) 0*512kB 0*1024kB 1*2048kB (R) 0*4096kB =
19360kB
[40003.148749] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0
hugepages_size=2048kB
[40003.151777] 31304 total pagecache pages
[40003.153288] 0 pages in swap cache
[40003.154528] Swap cache stats: add 0, delete 0, find 0/0
[40003.156377] Free swap  = 0kB
[40003.157423] Total swap = 0kB
[40003.158465] 7864221 pages RAM
[40003.159522] 0 pages HighMem/MovableOnly
[40003.160984] 146372 pages reserved
[40003.162244] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds
swapents oom_score_adj name
[40003.165398] [ 2560]     0  2560     2804      181      11       3
0         -1000 udevd
[40003.168638] [ 3976]     0  3976     2334      123       9       3
0             0 dhclient
[40003.171895] [ 4017]     0  4017    11626       89      23       4
0         -1000 auditd
[40003.175080] [ 4035]     0  4035    61861       99      23       3
0             0 rsyslogd
[40003.178198] [ 4046]     0  4046     3462       98      10       3
0             0 irqbalance
[40003.181559] [ 4052]     0  4052     1096       22       7       3
0             0 rngd
[40003.184683] [ 4067]    32  4067     8815       99      22       3
0             0 rpcbind
[40003.187772] [ 4084]    29  4084     9957      201      24       3
0             0 rpc.statd
[40003.191099] [ 4115]    81  4115     5442       60      15       3
0             0 dbus-daemon
[40003.194438] [ 4333]     0  4333    19452      522      40       3
0         -1000 sshd
[40003.197432] [ 4361]    38  4361     7321      562      19       3
0             0 ntpd
[40003.200609] [ 4376]     0  4376    22238      720      46       3
0             0 sendmail
[40003.203868] [ 4384]    51  4384    20103      674      41       3
0             0 sendmail
[40003.206963] [ 4515]     0  4515     4267       38      13       3
0             0 atd
[40003.210100] [ 6730]     0  6730    29888      547      13       3
0             0 crond
[40003.213267] [13533]   497 13533 47235415  7455314   36074     167
0             0 java
[40003.216364] [13674]   498 13674    49154     3168      51       3
0             0 supervisord
[40003.219721] [13680]   498 13680    51046     5350      69       3
0             0 python
[40003.222908] [13682]   498 13682    36172     5602      75       3
0             0 python
[40003.225952] [13683]   498 13683    32633     5319      68       3
0             0 python
[40003.229108] [13684]   498 13684    29577     5003      63       3
0             0 python
[40003.232263] [13719]   498 13719  1035920    41287     234       8
0             0 java
[40003.235287] [13753]   498 13753    34605     5645      70       3
0             0 python
[40003.238322] [14143]     0 14143     1615      420       9       3
0             0 agetty
[40003.241582] [14145]     0 14145     1078      377       8       3
0             0 mingetty
[40003.244752] [14147]     0 14147     1078      354       8       3
0             0 mingetty
[40003.247833] [14149]     0 14149     1078      373       8       3
0             0 mingetty
[40003.251193] [14151]     0 14151     1078      367       7       3
0             0 mingetty
[40003.254342] [14153]     0 14153     1078      348       8       3
0             0 mingetty
[40003.257443] [14154]     0 14154     2803      182      10       3
0         -1000 udevd
[40003.260688] [14155]     0 14155     2803      182      10       3
0         -1000 udevd
[40003.263782] [14157]     0 14157     1078      369       8       3
0             0 mingetty
[40003.266895] Out of memory: Kill process 13533 (java) score 970 or
sacrifice child
[40003.269702] Killed process 13533 (java) total-vm:188941660kB,
anon-rss:29710828kB, file-rss:110428kB

Re: Cassandra causing OOM Killer to strike on new cluster running 3.4

Posted by Satoshi Hikida <sa...@imagine-orb.com>.
Hi, Paulo

Thank you for your help!

I'll try to use it.

Regards,
Satoshi

On Wed, Apr 20, 2016 at 8:43 PM, Paulo Motta <pa...@gmail.com>
wrote:

> 2.2.6 should be released in the next couple of weeks, but I also attached
> a patch file to the issue if you want to patch 2.2.5 manually.
>
> 2016-04-20 0:19 GMT-03:00 Satoshi Hikida <sa...@imagine-orb.com>:
>
>> Hi,
>>
>> I'm looking forward to a patch (file) for this bug (CASSANDRA-11344) to
>> apply to C* version 2.2.5. Is there an available patch for that version? I
>> checked the link (https://issues.apache.org/jira/browse/CASSANDRA-11344) but
>> couldn't find a patch file or anything like that.  Or is there any
>> workaround to avoid this bug with version 2.2.5?
>>
>> Regards,
>> Satoshi
>>
>> On Tue, Mar 15, 2016 at 4:47 AM, Adam Plumb <ap...@fiksu.com> wrote:
>>
>>> OK so good news, I'm running with the patched jar file in my cluster and
>>> haven't seen any issues.  The bloom filter off-heap memory usage is between
>>> 1.5GB and 2GB per node, which is much more in-line with what I'm expecting!
>>>  (thumbsup)
>>>
>>> On Mon, Mar 14, 2016 at 9:42 AM, Adam Plumb <ap...@fiksu.com> wrote:
>>>
>>>> Thanks for the link!  Luckily the cluster I'm running is not yet in
>>>> production and running with dummy data so I will throw that jar on the
>>>> nodes and I'll let you know how things shake out.
>>>>
>>>> On Sun, Mar 13, 2016 at 11:02 PM, Paulo Motta <pauloricardomg@gmail.com
>>>> > wrote:
>>>>
>>>>> You could be hitting CASSANDRA-11344 (
>>>>> https://issues.apache.org/jira/browse/CASSANDRA-11344).  If that's
>>>>> the case, you may try to replace your cassandra jar on an affected node
>>>>> with a version with this fix in place and force bloom filter regeneration
>>>>> to see if it fixes your problem. You can build with "ant jar" from this
>>>>> branch: https://github.com/pauloricardomg/cassandra/tree/3.4-11344
>>>>>
>>>>> You can force bloom filter regeneration by either removing your
>>>>> *Filter.db files (make sure to back them up first for safety) or changing
>>>>> the bloom_filter_fp_chance before restarting affected nodes with the fixed
>>>>> jar.
>>>>>
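
A minimal sketch of that workflow (paths, keyspace and table names are
illustrative; the data directory is taken from the ps output at the top of the
thread, and the node should be stopped first):

    # build the patched jar from the branch above
    git clone https://github.com/pauloricardomg/cassandra.git
    cd cassandra && git checkout 3.4-11344
    ant jar

    # back up, then remove, the bloom filter components so they are
    # rebuilt when the node restarts with the fixed jar
    find /usr/local/cassandra/data -name '*Filter.db' | tar czf /tmp/filter-backup.tgz -T -
    find /usr/local/cassandra/data -name '*Filter.db' -delete

    # alternatively, change the false-positive chance to any value different
    # from the current one so the filters are regenerated instead (in cqlsh):
    #   ALTER TABLE abc.xyz WITH bloom_filter_fp_chance = 0.01;
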
>>>>> 2016-03-13 19:51 GMT-03:00 Adam Plumb <ap...@fiksu.com>:
>>>>>
>>>>>> So it's looking like the bloom filter off heap memory usage is
>>>>>> ramping up and up until the OOM killer kills the java process.  I
>>>>>> relaunched on instances with 60GB of memory and the same thing is
>>>>>> happening.  A node will start using more and more RAM until the process is
>>>>>> killed, then another node will start using more and more until it is also
>>>>>> killed.
>>>>>>
>>>>>> Is this the expected behavior?  It doesn't seem ideal to me.  Is
>>>>>> there anything obvious that I'm doing wrong?
>>>>>>
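
One way to watch that growth, assuming stock nodetool in this release and the
table names from the schema quoted below:

    # per-table view; look for "Bloom filter off heap memory used"
    nodetool tablestats abc.xyz abc.def
    # node-level heap vs. off-heap summary
    nodetool info
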
>>>>>> On Fri, Mar 11, 2016 at 11:31 AM, Adam Plumb <ap...@fiksu.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Here is the creation syntax for the entire schema.  The xyz table
>>>>>>> has about 2.1 billion keys and the def table has about 230 million keys.
>>>>>>> Max row size is about 3KB, mean row size is 700B.
>>>>>>>
>>>>>>> CREATE KEYSPACE abc WITH replication = {'class':
>>>>>>>> 'NetworkTopologyStrategy', 'us-east': 3};
>>>>>>>> CREATE TABLE xyz (
>>>>>>>>   id text,
>>>>>>>>   secondary_id int,
>>>>>>>>   data text,
>>>>>>>>   PRIMARY KEY(id)
>>>>>>>> )
>>>>>>>>   WITH
>>>>>>>>   compaction = { 'class': 'LeveledCompactionStrategy' }
>>>>>>>>   and compression = {'class': 'LZ4Compressor'};
>>>>>>>> CREATE INDEX secondary_id_index ON abc.xyz (secondary_id);
>>>>>>>> CREATE TABLE def (
>>>>>>>>   id text,
>>>>>>>>   secondary_id int,
>>>>>>>>   data text,
>>>>>>>>   PRIMARY KEY(id)
>>>>>>>> )
>>>>>>>>   WITH
>>>>>>>>   compaction = { 'class': 'LeveledCompactionStrategy' }
>>>>>>>>   and compression = {'class': 'LZ4Compressor'};
>>>>>>>> CREATE INDEX secondary_id_index_def ON abc.def (secondary_id);
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Mar 11, 2016 at 11:24 AM, Jack Krupansky <
>>>>>>> jack.krupansky@gmail.com> wrote:
>>>>>>>
>>>>>>>> What is your schema and data like - in particular, how wide are
>>>>>>>> your partitions (number of rows and typical row size)?
>>>>>>>>
>>>>>>>> Maybe you just need (a lot) more heap for rows during the repair
>>>>>>>> process.
>>>>>>>>
>>>>>>>> -- Jack Krupansky
>>>>>>>>
>>>>>>>> On Fri, Mar 11, 2016 at 11:19 AM, Adam Plumb <ap...@fiksu.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> These are brand new boxes only running Cassandra.  Yeah the kernel
>>>>>>>>> is what is killing the JVM, and this does appear to be a memory leak in
>>>>>>>>> Cassandra.  And Cassandra is the only thing running, aside from the basic
>>>>>>>>> services needed for Amazon Linux to run.
>>>>>>>>>
>>>>>>>>> On Fri, Mar 11, 2016 at 11:17 AM, Sebastian Estevez <
>>>>>>>>> sebastian.estevez@datastax.com> wrote:
>>>>>>>>>
>>>>>>>>>> "Sacrifice child" in dmesg is your OS killing the process with the
>>>>>>>>>> most RAM. That means you're actually running out of memory at the Linux
>>>>>>>>>> level outside of the JVM.
>>>>>>>>>>
>>>>>>>>>> Are you running anything other than Cassandra on this box?
>>>>>>>>>>
>>>>>>>>>> If so, does it have a memory leak?
>>>>>>>>>>
>>>>>>>>>> all the best,
>>>>>>>>>>
>>>>>>>>>> Sebastián
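
A quick way to confirm that it is the kernel doing the killing, and to see how
far past the 8G heap the process has grown (plain Linux tooling, nothing
Cassandra-specific):

    # kernel OOM-killer activity
    dmesg | grep -i 'killed process'
    # resident set size of the Cassandra JVM versus its configured heap
    ps -o pid,rss,vsz,args -C java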
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Cassandra causing OOM Killer to strike on new cluster running 3.4

Posted by Paulo Motta <pa...@gmail.com>.
2.2.6 should be released in the next couple of weeks, but I also attached a
patch file to the issue if you want to patch 2.2.5 manually.
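
A rough sketch of the manual route, assuming the patch attached to the JIRA
issue (the file name below is a placeholder, and cassandra-2.2.5 is the release
tag in the Apache git repo):

    git clone https://github.com/apache/cassandra.git
    cd cassandra && git checkout cassandra-2.2.5
    git apply /path/to/CASSANDRA-11344-2.2.patch   # file attached to the JIRA issue
    ant jar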

2016-04-20 0:19 GMT-03:00 Satoshi Hikida <sa...@imagine-orb.com>:

> Hi,
>
> I'm looking forward to a patch (file) for this bug (CASSANDRA-11344) to
> apply to C* version 2.2.5. Is there an available patch for that version? I
> checked the link (https://issues.apache.org/jira/browse/CASSANDRA-11344) but
> couldn't find a patch file or anything like that.  Or is there any
> workaround to avoid this bug with version 2.2.5?
>
> Regards,
> Satoshi
>
> On Tue, Mar 15, 2016 at 4:47 AM, Adam Plumb <ap...@fiksu.com> wrote:
>
>> OK so good news, I'm running with the patched jar file in my cluster and
>> haven't seen any issues.  The bloom filter off-heap memory usage is between
>> 1.5GB and 2GB per node, which is much more in-line with what I'm expecting!
>>  (thumbsup)
>>
>> On Mon, Mar 14, 2016 at 9:42 AM, Adam Plumb <ap...@fiksu.com> wrote:
>>
>>> Thanks for the link!  Luckily the cluster I'm running is not yet in
>>> production and running with dummy data so I will throw that jar on the
>>> nodes and I'll let you know how things shake out.
>>>
>>> On Sun, Mar 13, 2016 at 11:02 PM, Paulo Motta <pa...@gmail.com>
>>> wrote:
>>>
>>>> You could be hitting CASSANDRA-11344 (
>>>> https://issues.apache.org/jira/browse/CASSANDRA-11344).  If that's the
>>>> case, you may try to replace your cassandra jar on an affected node with a
>>>> version with this fix in place and force bloom filter regeneration to see
>>>> if it fixes your problem. You can build with "ant jar" from this branch:
>>>> https://github.com/pauloricardomg/cassandra/tree/3.4-11344
>>>>
>>>> You can force bloom filter regeneration by either removing your
>>>> *Filter.db files (make sure to back them up first for safety) or changing
>>>> the bloom_filter_fp_chance before restarting affected nodes with the fixed
>>>> jar.
>>>>
>>>> 2016-03-13 19:51 GMT-03:00 Adam Plumb <ap...@fiksu.com>:
>>>>
>>>>> So it's looking like the bloom filter off heap memory usage is ramping
>>>>> up and up until the OOM killer kills the java process.  I relaunched on
>>>>> instances with 60GB of memory and the same thing is happening.  A node will
>>>>> start using more and more RAM until the process is killed, then another
>>>>> node will start using more and more until it is also killed.
>>>>>
>>>>> Is this the expected behavior?  It doesn't seem ideal to me.  Is there
>>>>> anything obvious that I'm doing wrong?
>>>>>
>>>>> On Fri, Mar 11, 2016 at 11:31 AM, Adam Plumb <ap...@fiksu.com> wrote:
>>>>>
>>>>>> Here is the creation syntax for the entire schema.  The xyz table has
>>>>>> about 2.1 billion keys and the def table has about 230 million keys.  Max
>>>>>> row size is about 3KB, mean row size is 700B.
>>>>>>
>>>>>> CREATE KEYSPACE abc WITH replication = {'class':
>>>>>>> 'NetworkTopologyStrategy', 'us-east': 3};
>>>>>>> CREATE TABLE xyz (
>>>>>>>   id text,
>>>>>>>   secondary_id int,
>>>>>>>   data text,
>>>>>>>   PRIMARY KEY(id)
>>>>>>> )
>>>>>>>   WITH
>>>>>>>   compaction = { 'class': 'LeveledCompactionStrategy' }
>>>>>>>   and compression = {'class': 'LZ4Compressor'};
>>>>>>> CREATE INDEX secondary_id_index ON abc.xyz (secondary_id);
>>>>>>> CREATE TABLE def (
>>>>>>>   id text,
>>>>>>>   secondary_id int,
>>>>>>>   data text,
>>>>>>>   PRIMARY KEY(id)
>>>>>>> )
>>>>>>>   WITH
>>>>>>>   compaction = { 'class': 'LeveledCompactionStrategy' }
>>>>>>>   and compression = {'class': 'LZ4Compressor'};
>>>>>>> CREATE INDEX secondary_id_index_def ON abc.def (secondary_id);
>>>>>>
>>>>>>
>>>>>> On Fri, Mar 11, 2016 at 11:24 AM, Jack Krupansky <
>>>>>> jack.krupansky@gmail.com> wrote:
>>>>>>
>>>>>>> What is your schema and data like - in particular, how wide are your
>>>>>>> partitions (number of rows and typical row size)?
>>>>>>>
>>>>>>> Maybe you just need (a lot) more heap for rows during the repair
>>>>>>> process.
>>>>>>>
>>>>>>> -- Jack Krupansky
>>>>>>>
>>>>>>> On Fri, Mar 11, 2016 at 11:19 AM, Adam Plumb <ap...@fiksu.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> These are brand new boxes only running Cassandra.  Yeah the kernel
>>>>>>>> is what is killing the JVM, and this does appear to be a memory leak in
>>>>>>>> Cassandra.  And Cassandra is the only thing running, aside from the basic
>>>>>>>> services needed for Amazon Linux to run.
>>>>>>>>
>>>>>>>> On Fri, Mar 11, 2016 at 11:17 AM, Sebastian Estevez <
>>>>>>>> sebastian.estevez@datastax.com> wrote:
>>>>>>>>
>>>>>>>>> "Sacrifice child" in dmesg is your OS killing the process with the
>>>>>>>>> most RAM. That means you're actually running out of memory at the Linux
>>>>>>>>> level outside of the JVM.
>>>>>>>>>
>>>>>>>>> Are you running anything other than Cassandra on this box?
>>>>>>>>>
>>>>>>>>> If so, does it have a memory leak?
>>>>>>>>>
>>>>>>>>> all the best,
>>>>>>>>>
>>>>>>>>> Sebastián
>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Cassandra causing OOM Killer to strike on new cluster running 3.4

Posted by Satoshi Hikida <sa...@imagine-orb.com>.
Hi,

I'm looking forward to a patch (file) for this bug (CASSANDRA-11344) to
apply to C* version 2.2.5. Is there an available patch for that version? I
checked the link (https://issues.apache.org/jira/browse/CASSANDRA-11344) but
couldn't find a patch file or anything like that.  Or is there any
workaround to avoid this bug with version 2.2.5?

Regards,
Satoshi

On Tue, Mar 15, 2016 at 4:47 AM, Adam Plumb <ap...@fiksu.com> wrote:

> OK so good news, I'm running with the patched jar file in my cluster and
> haven't seen any issues.  The bloom filter off-heap memory usage is between
> 1.5GB and 2GB per node, which is much more in-line with what I'm expecting!
>  (thumbsup)
>
> On Mon, Mar 14, 2016 at 9:42 AM, Adam Plumb <ap...@fiksu.com> wrote:
>
>> Thanks for the link!  Luckily the cluster I'm running is not yet in
>> production and running with dummy data so I will throw that jar on the
>> nodes and I'll let you know how things shake out.
>>
>> On Sun, Mar 13, 2016 at 11:02 PM, Paulo Motta <pa...@gmail.com>
>> wrote:
>>
>>> You could be hitting CASSANDRA-11344 (
>>> https://issues.apache.org/jira/browse/CASSANDRA-11344).  If that's the
>>> case, you may try to replace your cassandra jar on an affected node with a
>>> version with this fix in place and force bloom filter regeneration to see
>>> if it fixes your problem. You can build with "ant jar" from this branch:
>>> https://github.com/pauloricardomg/cassandra/tree/3.4-11344
>>>
>>> You can force bloom filter regeneration by either removing your
>>> *Filter.db files (make sure to back them up first for safety) or changing
>>> the bloom_filter_fp_chance before restarting affected nodes with the fixed
>>> jar.
>>>
>>> 2016-03-13 19:51 GMT-03:00 Adam Plumb <ap...@fiksu.com>:
>>>
>>>> So it's looking like the bloom filter off heap memory usage is ramping
>>>> up and up until the OOM killer kills the java process.  I relaunched on
>>>> instances with 60GB of memory and the same thing is happening.  A node will
>>>> start using more and more RAM until the process is killed, then another
>>>> node will start using more and more until it is also killed.
>>>>
>>>> Is this the expected behavior?  It doesn't seem ideal to me.  Is there
>>>> anything obvious that I'm doing wrong?
>>>>
>>>> On Fri, Mar 11, 2016 at 11:31 AM, Adam Plumb <ap...@fiksu.com> wrote:
>>>>
>>>>> Here is the creation syntax for the entire schema.  The xyz table has
>>>>> about 2.1 billion keys and the def table has about 230 million keys.  Max
>>>>> row size is about 3KB, mean row size is 700B.
>>>>>
>>>>> CREATE KEYSPACE abc WITH replication = {'class':
>>>>>> 'NetworkTopologyStrategy', 'us-east': 3};
>>>>>> CREATE TABLE xyz (
>>>>>>   id text,
>>>>>>   secondary_id int,
>>>>>>   data text,
>>>>>>   PRIMARY KEY(id)
>>>>>> )
>>>>>>   WITH
>>>>>>   compaction = { 'class': 'LeveledCompactionStrategy' }
>>>>>>   and compression = {'class': 'LZ4Compressor'};
>>>>>> CREATE INDEX secondary_id_index ON abc.xyz (secondary_id);
>>>>>> CREATE TABLE def (
>>>>>>   id text,
>>>>>>   secondary_id int,
>>>>>>   data text,
>>>>>>   PRIMARY KEY(id)
>>>>>> )
>>>>>>   WITH
>>>>>>   compaction = { 'class': 'LeveledCompactionStrategy' }
>>>>>>   and compression = {'class': 'LZ4Compressor'};
>>>>>> CREATE INDEX secondary_id_index_def ON abc.def (secondary_id);
>>>>>
>>>>>
>>>>> On Fri, Mar 11, 2016 at 11:24 AM, Jack Krupansky <
>>>>> jack.krupansky@gmail.com> wrote:
>>>>>
>>>>>> What is your schema and data like - in particular, how wide are your
>>>>>> partitions (number of rows and typical row size)?
>>>>>>
>>>>>> Maybe you just need (a lot) more heap for rows during the repair
>>>>>> process.
>>>>>>
>>>>>> -- Jack Krupansky
>>>>>>
>>>>>> On Fri, Mar 11, 2016 at 11:19 AM, Adam Plumb <ap...@fiksu.com>
>>>>>> wrote:
>>>>>>
>>>>>>> These are brand new boxes only running Cassandra.  Yeah the kernel
>>>>>>> is what is killing the JVM, and this does appear to be a memory leak in
>>>>>>> Cassandra.  And Cassandra is the only thing running, aside from the basic
>>>>>>> services needed for Amazon Linux to run.
>>>>>>>
>>>>>>> On Fri, Mar 11, 2016 at 11:17 AM, Sebastian Estevez <
>>>>>>> sebastian.estevez@datastax.com> wrote:
>>>>>>>
>>>>>>>> Sacrifice child in dmesg is your OS killing the process with the
>>>>>>>> most ram. That means you're actually running out of memory at the Linux
>>>>>>>> level outside of the JVM.
>>>>>>>>
>>>>>>>> Are you running anything other than Cassandra on this box?
>>>>>>>>
>>>>>>>> If so, does it have a memory leak?
>>>>>>>>
>>>>>>>> all the best,
>>>>>>>>
>>>>>>>> Sebastián
>>>>>>>> On Mar 11, 2016 11:14 AM, "Adam Plumb" <ap...@fiksu.com> wrote:
>>>>>>>>
>>>>>>>>> I've got a new cluster of 18 nodes running Cassandra 3.4 that I
>>>>>>>>> just launched and loaded data into yesterday (roughly 2TB of total storage)
>>>>>>>>> and am seeing runaway memory usage.  These nodes are EC2 c3.4xlarges with
>>>>>>>>> 30GB RAM and the heap size is set to 8G with a new heap size of 1.6G.
>>>>>>>>>
>>>>>>>>> Last night I finished loading up the data, then ran an incremental
>>>>>>>>> repair on one of the nodes just to ensure that everything was working
>>>>>>>>> (nodetool repair).  Over night all 18 nodes ran out of memory and were
>>>>>>>>> killed by the OOM killer.  I restarted them this morning and they all came
>>>>>>>>> up fine, but just started churning through memory and got killed again.  I
>>>>>>>>> restarted them again and they're doing the same thing.  I'm not getting any
>>>>>>>>> errors in the system log, since the process is getting killed abruptly
>>>>>>>>> (which makes me think this is a native memory issue, not heap)
>>>>>>>>>
>>>>>>>>> Obviously this behavior isn't the best.  I'm willing to provide
>>>>>>>>> any data people need to help debug this, these nodes are still up and
>>>>>>>>> running.  I'm also in IRC if anyone wants to jump on there.
>>>>>>>>>
>>>>>>>>> Here is the output of ps aux:
>>>>>>>>>
>>>>>>>>> 497       64351  108 89.5 187156072 27642988 ?  SLl  15:13  62:15
>>>>>>>>>> java -ea -XX:+CMSClassUnloadingEnabled -XX:+UseThreadPriorities
>>>>>>>>>> -XX:ThreadPriorityPolicy=42 -Xms7536M -Xmx7536M -Xmn1600M
>>>>>>>>>> -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=1000003
>>>>>>>>>> -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
>>>>>>>>>> -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1
>>>>>>>>>> -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
>>>>>>>>>> -XX:+UseTLAB -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=45
>>>>>>>>>> -XX:-ParallelRefProcEnabled -XX:-AlwaysPreTouch -XX:+UseBiasedLocking
>>>>>>>>>> -XX:+UseTLAB -XX:+ResizeTLAB -Djava.net.preferIPv4Stack=true
>>>>>>>>>> -Dcom.sun.management.jmxremote.port=7199
>>>>>>>>>> -Dcom.sun.management.jmxremote.rmi.port=7199
>>>>>>>>>> -Dcom.sun.management.jmxremote.ssl=false
>>>>>>>>>> -Dcom.sun.management.jmxremote.authenticate=false
>>>>>>>>>> -XX:+CMSClassUnloadingEnabled -Dlogback.configurationFile=logback.xml -D
>>>>>>>>>> *cas*sandra.logdir=/usr/local/*cas*sandra/logs -D*cas*
>>>>>>>>>> sandra.storagedir=/usr/local/*cas*sandra/data -D*cas*
>>>>>>>>>> sandra-pidfile=/var/run/*cas*sandra/*cas*sandra.pid -cp
>>>>>>>>>> /usr/local/*cas*sandra/conf:/usr/local/*cas*
>>>>>>>>>> sandra/build/classes/main:/usr/local/*cas*
>>>>>>>>>> sandra/build/classes/thrift:/usr/local/*cas*
>>>>>>>>>> sandra/lib/airline-0.6.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/antlr-runtime-3.5.2.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/apache-*cas*sandra-3.4.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/apache-*cas*sandra-clientutil-3.4.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/apache-*cas*sandra-thrift-3.4.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/asm-5.0.4.jar:/usr/local/*cas*sandra/lib/*cas*
>>>>>>>>>> sandra-driver-core-3.0.0-shaded.jar:/usr/local/*ca*
>>>>>>>>>> sandra/lib/commons-cli-1.1.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/commons-codec-1.2.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/commons-lang3-3.1.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/commons-math3-3.2.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/compress-lzf-0.8.4.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/concurrentlinkedhashmap-lru-1.4.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/concurrent-trees-2.4.0.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/disruptor-3.0.1.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/ecj-4.4.2.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/guava-18.0.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/high-scale-lib-1.0.6.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/hppc-0.5.4.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/jackson-core-asl-1.9.2.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/jamm-0.3.0.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/javax.inject.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/jbcrypt-0.3m.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/jcl-over-slf4j-1.7.7.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/jflex-1.6.0.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/jna-4.0.0.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/joda-time-2.4.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/json-simple-1.1.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/libthrift-0.9.2.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/log4j-over-slf4j-1.7.7.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/logback-classic-1.1.3.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/logback-core-1.1.3.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/lz4-1.3.0.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/metrics-core-3.1.0.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/metrics-logback-3.1.0.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/netty-all-4.0.23.Final.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/ohc-core-0.4.2.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/ohc-core-j8-0.4.2.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/primitive-1.0.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/reporter-config3-3.0.0.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/reporter-config-base-3.0.0.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/sigar-1.6.4.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/slf4j-api-1.7.7.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/snakeyaml-1.11.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/snappy-java-1.1.1.7.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/snowball-stemmer-1.3.0.581.1.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/ST4-4.0.8.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/stream-2.5.2.jar:/usr/local/*cas*
>>>>>>>>>> sandra/lib/thrift-server-0.3.7.jar:/usr/local/*cas*sandra/lib/jsr223/*/*.jar
>>>>>>>>>> org.apache.*cas*sandra.service.CassandraDaemon
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>  Here is some dmesg output:
>>>>>>>>>
>>>>>>>>> [40003.010117] java invoked oom-killer: gfp_mask=0x280da, order=0,
>>>>>>>>> oom_score_adj=0
>>>>>>>>> [40003.013042] java cpuset=/ mems_allowed=0
>>>>>>>>> [40003.014789] CPU: 3 PID: 37757 Comm: java Tainted: G
>>>>>>>>> E   4.1.7-15.23.amzn1.x86_64 #1
>>>>>>>>> [40003.017852] Hardware name: Xen HVM domU, BIOS 4.2.amazon
>>>>>>>>> 12/07/2015
>>>>>>>>> [40003.020066]  0000000000000000 ffff8800ebaaba18 ffffffff814da12c
>>>>>>>>> 0000000000000000
>>>>>>>>> [40003.022870]  ffff880763594c80 ffff8800ebaabac8 ffffffff814d7939
>>>>>>>>> ffff8800ebaaba78
>>>>>>>>> [40003.025674]  ffffffff811bf8f7 ffff880770679c00 ffff88077001c190
>>>>>>>>> 0000000000000080
>>>>>>>>> [40003.028660] Call Trace:
>>>>>>>>> [40003.029613]  [<ffffffff814da12c>] dump_stack+0x45/0x57
>>>>>>>>> [40003.031486]  [<ffffffff814d7939>] dump_header+0x7f/0x1fe
>>>>>>>>> [40003.033390]  [<ffffffff811bf8f7>] ? mem_cgroup_iter+0x137/0x3d0
>>>>>>>>> [40003.035475]  [<ffffffff8107f496>] ? __queue_work+0x136/0x320
>>>>>>>>> [40003.037594]  [<ffffffff8115d11c>] oom_kill_process+0x1cc/0x3b0
>>>>>>>>> [40003.039825]  [<ffffffff8115d67e>] __out_of_memory+0x31e/0x530
>>>>>>>>> [40003.041938]  [<ffffffff8115da2b>] out_of_memory+0x5b/0x80
>>>>>>>>> [40003.043857]  [<ffffffff81162a79>]
>>>>>>>>> __alloc_pages_nodemask+0x8a9/0x8d0
>>>>>>>>> [40003.046105]  [<ffffffff811a48fa>]
>>>>>>>>> alloc_page_interleave+0x3a/0x90
>>>>>>>>> [40003.048419]  [<ffffffff811a79c3>] alloc_pages_vma+0x143/0x200
>>>>>>>>> [40003.050582]  [<ffffffff81188035>] handle_mm_fault+0x1355/0x1770
>>>>>>>>> [40003.052674]  [<ffffffff8118e4c5>] ? do_mmap_pgoff+0x2f5/0x3c0
>>>>>>>>> [40003.054737]  [<ffffffff8105dafc>] __do_page_fault+0x17c/0x420
>>>>>>>>> [40003.056858]  [<ffffffff8118c976>] ? SyS_mmap_pgoff+0x116/0x270
>>>>>>>>> [40003.059082]  [<ffffffff8105ddc2>] do_page_fault+0x22/0x30
>>>>>>>>> [40003.061084]  [<ffffffff814e2ad8>] page_fault+0x28/0x30
>>>>>>>>> [40003.062938] Mem-Info:
>>>>>>>>> [40003.063762] active_anon:5437903 inactive_anon:1025
>>>>>>>>> isolated_anon:0
>>>>>>>>>  active_file:51 inactive_file:8 isolated_file:0
>>>>>>>>>  unevictable:2088582 dirty:0 writeback:0 unstable:0
>>>>>>>>>  slab_reclaimable:82028 slab_unreclaimable:12209
>>>>>>>>>  mapped:31065 shmem:20 pagetables:37089 bounce:0
>>>>>>>>>  free:35830 free_pcp:3141 free_cma:0
>>>>>>>>> [40003.075549] Node 0 DMA free:15872kB min:8kB low:8kB high:12kB
>>>>>>>>> active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB
>>>>>>>>> unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15988kB
>>>>>>>>> managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB
>>>>>>>>> slab_reclaimable:32kB slab_unreclaimable:0kB kernel_stack:0kB
>>>>>>>>> pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB
>>>>>>>>> free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
>>>>>>>>> [40003.090267] lowmem_reserve[]: 0 3746 30128 30128
>>>>>>>>> [40003.092182] Node 0 DMA32 free:108236kB min:2756kB low:3444kB
>>>>>>>>> high:4132kB active_anon:2400616kB inactive_anon:4060kB active_file:0kB
>>>>>>>>> inactive_file:0kB unevictable:1049732kB isolated(anon):0kB
>>>>>>>>> isolated(file):0kB present:3915776kB managed:3840296kB mlocked:1049732kB
>>>>>>>>> dirty:4kB writeback:0kB mapped:16564kB shmem:12kB slab_reclaimable:243852kB
>>>>>>>>> slab_unreclaimable:8832kB kernel_stack:1152kB pagetables:16532kB
>>>>>>>>> unstable:0kB bounce:0kB free_pcp:5716kB local_pcp:220kB free_cma:0kB
>>>>>>>>> writeback_tmp:0kB pages_scanned:5408 all_unreclaimable? yes
>>>>>>>>> [40003.108802] lowmem_reserve[]: 0 0 26382 26382
>>>>>>>>> [40003.110578] Node 0 Normal free:19212kB min:19412kB low:24264kB
>>>>>>>>> high:29116kB active_anon:19350996kB inactive_anon:40kB active_file:212kB
>>>>>>>>> inactive_file:80kB unevictable:7304596kB isolated(anon):0kB
>>>>>>>>> isolated(file):0kB present:27525120kB managed:27015196kB mlocked:7304596kB
>>>>>>>>> dirty:0kB writeback:0kB mapped:107696kB shmem:68kB slab_reclaimable:84228kB
>>>>>>>>> slab_unreclaimable:40004kB kernel_stack:10000kB pagetables:131824kB
>>>>>>>>> unstable:0kB bounce:0kB free_pcp:6848kB local_pcp:692kB free_cma:0kB
>>>>>>>>> writeback_tmp:0kB pages_scanned:38332 all_unreclaimable? yes
>>>>>>>>> [40003.128300] lowmem_reserve[]: 0 0 0 0
>>>>>>>>> [40003.129844] Node 0 DMA: 0*4kB 0*8kB 0*16kB 2*32kB (UE) 3*64kB
>>>>>>>>> (UE) 2*128kB (UE) 2*256kB (UE) 1*512kB (E) 2*1024kB (UE) 2*2048kB (ER)
>>>>>>>>> 2*4096kB (M) = 15872kB
>>>>>>>>> [40003.135917] Node 0 DMA32: 193*4kB (UEM) 254*8kB (UEM) 714*16kB
>>>>>>>>> (UE) 1344*32kB (UEMR) 249*64kB (UEMR) 120*128kB (UER) 53*256kB (ER)
>>>>>>>>> 10*512kB (ER) 1*1024kB (E) 0*2048kB 0*4096kB = 108244kB
>>>>>>>>> [40003.142956] Node 0 Normal: 3956*4kB (UE) 0*8kB 1*16kB (R)
>>>>>>>>> 8*32kB (R) 3*64kB (R) 2*128kB (R) 3*256kB (R) 0*512kB 0*1024kB 1*2048kB (R)
>>>>>>>>> 0*4096kB = 19360kB
>>>>>>>>> [40003.148749] Node 0 hugepages_total=0 hugepages_free=0
>>>>>>>>> hugepages_surp=0 hugepages_size=2048kB
>>>>>>>>> [40003.151777] 31304 total pagecache pages
>>>>>>>>> [40003.153288] 0 pages in swap cache
>>>>>>>>> [40003.154528] Swap cache stats: add 0, delete 0, find 0/0
>>>>>>>>> [40003.156377] Free swap  = 0kB
>>>>>>>>> [40003.157423] Total swap = 0kB
>>>>>>>>> [40003.158465] 7864221 pages RAM
>>>>>>>>> [40003.159522] 0 pages HighMem/MovableOnly
>>>>>>>>> [40003.160984] 146372 pages reserved
>>>>>>>>> [40003.162244] [ pid ]   uid  tgid total_vm      rss nr_ptes
>>>>>>>>> nr_pmds swapents oom_score_adj name
>>>>>>>>> [40003.165398] [ 2560]     0  2560     2804      181      11
>>>>>>>>> 3        0         -1000 udevd
>>>>>>>>> [40003.168638] [ 3976]     0  3976     2334      123       9
>>>>>>>>> 3        0             0 dhclient
>>>>>>>>> [40003.171895] [ 4017]     0  4017    11626       89      23
>>>>>>>>> 4        0         -1000 auditd
>>>>>>>>> [40003.175080] [ 4035]     0  4035    61861       99      23
>>>>>>>>> 3        0             0 rsyslogd
>>>>>>>>> [40003.178198] [ 4046]     0  4046     3462       98      10
>>>>>>>>> 3        0             0 irqbalance
>>>>>>>>> [40003.181559] [ 4052]     0  4052     1096       22       7
>>>>>>>>> 3        0             0 rngd
>>>>>>>>> [40003.184683] [ 4067]    32  4067     8815       99      22
>>>>>>>>> 3        0             0 rpcbind
>>>>>>>>> [40003.187772] [ 4084]    29  4084     9957      201      24
>>>>>>>>> 3        0             0 rpc.statd
>>>>>>>>> [40003.191099] [ 4115]    81  4115     5442       60      15
>>>>>>>>> 3        0             0 dbus-daemon
>>>>>>>>> [40003.194438] [ 4333]     0  4333    19452      522      40
>>>>>>>>> 3        0         -1000 sshd
>>>>>>>>> [40003.197432] [ 4361]    38  4361     7321      562      19
>>>>>>>>> 3        0             0 ntpd
>>>>>>>>> [40003.200609] [ 4376]     0  4376    22238      720      46
>>>>>>>>> 3        0             0 sendmail
>>>>>>>>> [40003.203868] [ 4384]    51  4384    20103      674      41
>>>>>>>>> 3        0             0 sendmail
>>>>>>>>> [40003.206963] [ 4515]     0  4515     4267       38      13
>>>>>>>>> 3        0             0 atd
>>>>>>>>> [40003.210100] [ 6730]     0  6730    29888      547      13
>>>>>>>>> 3        0             0 crond
>>>>>>>>> [40003.213267] [13533]   497 13533 47235415  7455314   36074
>>>>>>>>> 167        0             0 java
>>>>>>>>> [40003.216364] [13674]   498 13674    49154     3168      51
>>>>>>>>> 3        0             0 supervisord
>>>>>>>>> [40003.219721] [13680]   498 13680    51046     5350      69
>>>>>>>>> 3        0             0 python
>>>>>>>>> [40003.222908] [13682]   498 13682    36172     5602      75
>>>>>>>>> 3        0             0 python
>>>>>>>>> [40003.225952] [13683]   498 13683    32633     5319      68
>>>>>>>>> 3        0             0 python
>>>>>>>>> [40003.229108] [13684]   498 13684    29577     5003      63
>>>>>>>>> 3        0             0 python
>>>>>>>>> [40003.232263] [13719]   498 13719  1035920    41287     234
>>>>>>>>> 8        0             0 java
>>>>>>>>> [40003.235287] [13753]   498 13753    34605     5645      70
>>>>>>>>> 3        0             0 python
>>>>>>>>> [40003.238322] [14143]     0 14143     1615      420       9
>>>>>>>>> 3        0             0 agetty
>>>>>>>>> [40003.241582] [14145]     0 14145     1078      377       8
>>>>>>>>> 3        0             0 mingetty
>>>>>>>>> [40003.244752] [14147]     0 14147     1078      354       8
>>>>>>>>> 3        0             0 mingetty
>>>>>>>>> [40003.247833] [14149]     0 14149     1078      373       8
>>>>>>>>> 3        0             0 mingetty
>>>>>>>>> [40003.251193] [14151]     0 14151     1078      367       7
>>>>>>>>> 3        0             0 mingetty
>>>>>>>>> [40003.254342] [14153]     0 14153     1078      348       8
>>>>>>>>> 3        0             0 mingetty
>>>>>>>>> [40003.257443] [14154]     0 14154     2803      182      10
>>>>>>>>> 3        0         -1000 udevd
>>>>>>>>> [40003.260688] [14155]     0 14155     2803      182      10
>>>>>>>>> 3        0         -1000 udevd
>>>>>>>>> [40003.263782] [14157]     0 14157     1078      369       8
>>>>>>>>> 3        0             0 mingetty
>>>>>>>>> [40003.266895] Out of memory: Kill process 13533 (java) score 970
>>>>>>>>> or sacrifice child
>>>>>>>>> [40003.269702] Killed process 13533 (java) total-vm:188941660kB,
>>>>>>>>> anon-rss:29710828kB, file-rss:110428kB
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Cassandra causing OOM Killer to strike on new cluster running 3.4

Posted by Adam Plumb <ap...@fiksu.com>.
OK, so good news: I'm running with the patched jar file in my cluster and
haven't seen any issues.  The bloom filter off-heap memory usage is between
1.5GB and 2GB per node, which is much more in line with what I'm expecting!
 (thumbsup)
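
For anyone wanting to check the same number, a quick way to spot-check the
per-table bloom filter footprint is nodetool cfstats.  This is only a sketch --
it assumes nodetool is on the PATH and uses the abc.xyz / abc.def tables from
the schema quoted below:

  # print only the bloom filter lines for the two tables
  nodetool cfstats abc.xyz abc.def | grep -i 'bloom filter'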

On Mon, Mar 14, 2016 at 9:42 AM, Adam Plumb <ap...@fiksu.com> wrote:

> Thanks for the link!  Luckily the cluster I'm running is not yet in
> production and running with dummy data so I will throw that jar on the
> nodes and I'll let you know how things shake out.
>
> On Sun, Mar 13, 2016 at 11:02 PM, Paulo Motta <pa...@gmail.com>
> wrote:
>
>> You could be hitting CASSANDRA-11344 (
>> https://issues.apache.org/jira/browse/CASSANDRA-11344).  If that's the
>> case, you may try to replace your cassandra jar on an affected node with a
>> version with this fix in place and force bloom filter regeneration to see
>> if it fixes your problem. You can build with "ant jar" from this branch:
>> https://github.com/pauloricardomg/cassandra/tree/3.4-11344
>>
>> You can force bloom filter regeneration by either removing your
>> *Filter.db files (make sure to back them up first for safety) or changing
>> the bloom_filter_fp_chance before restarting affected nodes with the fixed
>> jar.
>>
>> 2016-03-13 19:51 GMT-03:00 Adam Plumb <ap...@fiksu.com>:
>>
>>> So it's looking like the bloom filter off heap memory usage is ramping
>>> up and up until the OOM killer kills the java process.  I relaunched on
>>> instances with 60GB of memory and the same thing is happening.  A node will
>>> start using more and more RAM until the process is killed, then another
>>> node will start using more and more until it is also killed.
>>>
>>> Is this the expected behavior?  It doesn't seem ideal to me.  Is there
>>> anything obvious that I'm doing wrong?
>>>
>>> On Fri, Mar 11, 2016 at 11:31 AM, Adam Plumb <ap...@fiksu.com> wrote:
>>>
>>>> Here is the creation syntax for the entire schema.  The xyz table has
>>>> about 2.1 billion keys and the def table has about 230 million keys.  Max
>>>> row size is about 3KB, mean row size is 700B.
>>>>
>>>> CREATE KEYSPACE abc WITH replication = {'class':
>>>>> 'NetworkTopologyStrategy', 'us-east': 3};
>>>>> CREATE TABLE xyz (
>>>>>   id text,
>>>>>   secondary_id int,
>>>>>   data text,
>>>>>   PRIMARY KEY(id)
>>>>> )
>>>>>   WITH
>>>>>   compaction = { 'class': 'LeveledCompactionStrategy' }
>>>>>   and compression = {'class': 'LZ4Compressor'};
>>>>> CREATE INDEX secondary_id_index ON abc.xyz (secondary_id);
>>>>> CREATE TABLE def (
>>>>>   id text,
>>>>>   secondary_id int,
>>>>>   data text,
>>>>>   PRIMARY KEY(id)
>>>>> )
>>>>>   WITH
>>>>>   compaction = { 'class': 'LeveledCompactionStrategy' }
>>>>>   and compression = {'class': 'LZ4Compressor'};
>>>>> CREATE INDEX secondary_id_index_def ON abc.def (secondary_id);
>>>>
>>>>
>>>> On Fri, Mar 11, 2016 at 11:24 AM, Jack Krupansky <
>>>> jack.krupansky@gmail.com> wrote:
>>>>
>>>>> What is your schema and data like - in particular, how wide are your
>>>>> partitions (number of rows and typical row size)?
>>>>>
>>>>> Maybe you just need (a lot) more heap for rows during the repair
>>>>> process.
>>>>>
>>>>> -- Jack Krupansky
>>>>>
>>>>> On Fri, Mar 11, 2016 at 11:19 AM, Adam Plumb <ap...@fiksu.com> wrote:
>>>>>
>>>>>> These are brand new boxes only running Cassandra.  Yeah the kernel is
>>>>>> what is killing the JVM, and this does appear to be a memory leak in
>>>>>> Cassandra.  And Cassandra is the only thing running, aside from the basic
>>>>>> services needed for Amazon Linux to run.
>>>>>>
>>>>>> On Fri, Mar 11, 2016 at 11:17 AM, Sebastian Estevez <
>>>>>> sebastian.estevez@datastax.com> wrote:
>>>>>>
>>>>>>> Sacrifice child in dmesg is your OS killing the process with the
>>>>>>> most ram. That means you're actually running out of memory at the Linux
>>>>>>> level outside of the JVM.
>>>>>>>
>>>>>>> Are you running anything other than Cassandra on this box?
>>>>>>>
>>>>>>> If so, does it have a memory leak?
>>>>>>>
>>>>>>> all the best,
>>>>>>>
>>>>>>> Sebastián
>>>>>>> On Mar 11, 2016 11:14 AM, "Adam Plumb" <ap...@fiksu.com> wrote:
>>>>>>>
>>>>>>>> I've got a new cluster of 18 nodes running Cassandra 3.4 that I
>>>>>>>> just launched and loaded data into yesterday (roughly 2TB of total storage)
>>>>>>>> and am seeing runaway memory usage.  These nodes are EC2 c3.4xlarges with
>>>>>>>> 30GB RAM and the heap size is set to 8G with a new heap size of 1.6G.
>>>>>>>>
>>>>>>>> Last night I finished loading up the data, then ran an incremental
>>>>>>>> repair on one of the nodes just to ensure that everything was working
>>>>>>>> (nodetool repair).  Over night all 18 nodes ran out of memory and were
>>>>>>>> killed by the OOM killer.  I restarted them this morning and they all came
>>>>>>>> up fine, but just started churning through memory and got killed again.  I
>>>>>>>> restarted them again and they're doing the same thing.  I'm not getting any
>>>>>>>> errors in the system log, since the process is getting killed abruptly
>>>>>>>> (which makes me think this is a native memory issue, not heap)
>>>>>>>>
>>>>>>>> Obviously this behavior isn't the best.  I'm willing to provide any
>>>>>>>> data people need to help debug this, these nodes are still up and running.
>>>>>>>> I'm also in IRC if anyone wants to jump on there.
>>>>>>>>
>>>>>>>> Here is the output of ps aux:
>>>>>>>>
>>>>>>>> 497       64351  108 89.5 187156072 27642988 ?  SLl  15:13  62:15
>>>>>>>>> java -ea -XX:+CMSClassUnloadingEnabled -XX:+UseThreadPriorities
>>>>>>>>> -XX:ThreadPriorityPolicy=42 -Xms7536M -Xmx7536M -Xmn1600M
>>>>>>>>> -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=1000003
>>>>>>>>> -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
>>>>>>>>> -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1
>>>>>>>>> -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
>>>>>>>>> -XX:+UseTLAB -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=45
>>>>>>>>> -XX:-ParallelRefProcEnabled -XX:-AlwaysPreTouch -XX:+UseBiasedLocking
>>>>>>>>> -XX:+UseTLAB -XX:+ResizeTLAB -Djava.net.preferIPv4Stack=true
>>>>>>>>> -Dcom.sun.management.jmxremote.port=7199
>>>>>>>>> -Dcom.sun.management.jmxremote.rmi.port=7199
>>>>>>>>> -Dcom.sun.management.jmxremote.ssl=false
>>>>>>>>> -Dcom.sun.management.jmxremote.authenticate=false
>>>>>>>>> -XX:+CMSClassUnloadingEnabled -Dlogback.configurationFile=logback.xml -D
>>>>>>>>> *cas*sandra.logdir=/usr/local/*cas*sandra/logs -D*cas*
>>>>>>>>> sandra.storagedir=/usr/local/*cas*sandra/data -D*cas*
>>>>>>>>> sandra-pidfile=/var/run/*cas*sandra/*cas*sandra.pid -cp
>>>>>>>>> /usr/local/*cas*sandra/conf:/usr/local/*cas*
>>>>>>>>> sandra/build/classes/main:/usr/local/*cas*
>>>>>>>>> sandra/build/classes/thrift:/usr/local/*cas*
>>>>>>>>> sandra/lib/airline-0.6.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/antlr-runtime-3.5.2.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/apache-*cas*sandra-3.4.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/apache-*cas*sandra-clientutil-3.4.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/apache-*cas*sandra-thrift-3.4.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/asm-5.0.4.jar:/usr/local/*cas*sandra/lib/*cas*
>>>>>>>>> sandra-driver-core-3.0.0-shaded.jar:/usr/local/*ca*
>>>>>>>>> sandra/lib/commons-cli-1.1.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/commons-codec-1.2.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/commons-lang3-3.1.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/commons-math3-3.2.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/compress-lzf-0.8.4.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/concurrentlinkedhashmap-lru-1.4.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/concurrent-trees-2.4.0.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/disruptor-3.0.1.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/ecj-4.4.2.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/guava-18.0.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/high-scale-lib-1.0.6.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/hppc-0.5.4.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/jackson-core-asl-1.9.2.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/jamm-0.3.0.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/javax.inject.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/jbcrypt-0.3m.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/jcl-over-slf4j-1.7.7.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/jflex-1.6.0.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/jna-4.0.0.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/joda-time-2.4.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/json-simple-1.1.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/libthrift-0.9.2.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/log4j-over-slf4j-1.7.7.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/logback-classic-1.1.3.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/logback-core-1.1.3.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/lz4-1.3.0.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/metrics-core-3.1.0.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/metrics-logback-3.1.0.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/netty-all-4.0.23.Final.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/ohc-core-0.4.2.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/ohc-core-j8-0.4.2.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/primitive-1.0.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/reporter-config3-3.0.0.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/reporter-config-base-3.0.0.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/sigar-1.6.4.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/slf4j-api-1.7.7.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/snakeyaml-1.11.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/snappy-java-1.1.1.7.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/snowball-stemmer-1.3.0.581.1.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/ST4-4.0.8.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/stream-2.5.2.jar:/usr/local/*cas*
>>>>>>>>> sandra/lib/thrift-server-0.3.7.jar:/usr/local/*cas*sandra/lib/jsr223/*/*.jar
>>>>>>>>> org.apache.*cas*sandra.service.CassandraDaemon
>>>>>>>>
>>>>>>>>
>>>>>>>>  Here is some dmesg output:
>>>>>>>>
>>>>>>>> [40003.010117] java invoked oom-killer: gfp_mask=0x280da, order=0,
>>>>>>>> oom_score_adj=0
>>>>>>>> [40003.013042] java cpuset=/ mems_allowed=0
>>>>>>>> [40003.014789] CPU: 3 PID: 37757 Comm: java Tainted: G            E
>>>>>>>>   4.1.7-15.23.amzn1.x86_64 #1
>>>>>>>> [40003.017852] Hardware name: Xen HVM domU, BIOS 4.2.amazon
>>>>>>>> 12/07/2015
>>>>>>>> [40003.020066]  0000000000000000 ffff8800ebaaba18 ffffffff814da12c
>>>>>>>> 0000000000000000
>>>>>>>> [40003.022870]  ffff880763594c80 ffff8800ebaabac8 ffffffff814d7939
>>>>>>>> ffff8800ebaaba78
>>>>>>>> [40003.025674]  ffffffff811bf8f7 ffff880770679c00 ffff88077001c190
>>>>>>>> 0000000000000080
>>>>>>>> [40003.028660] Call Trace:
>>>>>>>> [40003.029613]  [<ffffffff814da12c>] dump_stack+0x45/0x57
>>>>>>>> [40003.031486]  [<ffffffff814d7939>] dump_header+0x7f/0x1fe
>>>>>>>> [40003.033390]  [<ffffffff811bf8f7>] ? mem_cgroup_iter+0x137/0x3d0
>>>>>>>> [40003.035475]  [<ffffffff8107f496>] ? __queue_work+0x136/0x320
>>>>>>>> [40003.037594]  [<ffffffff8115d11c>] oom_kill_process+0x1cc/0x3b0
>>>>>>>> [40003.039825]  [<ffffffff8115d67e>] __out_of_memory+0x31e/0x530
>>>>>>>> [40003.041938]  [<ffffffff8115da2b>] out_of_memory+0x5b/0x80
>>>>>>>> [40003.043857]  [<ffffffff81162a79>]
>>>>>>>> __alloc_pages_nodemask+0x8a9/0x8d0
>>>>>>>> [40003.046105]  [<ffffffff811a48fa>] alloc_page_interleave+0x3a/0x90
>>>>>>>> [40003.048419]  [<ffffffff811a79c3>] alloc_pages_vma+0x143/0x200
>>>>>>>> [40003.050582]  [<ffffffff81188035>] handle_mm_fault+0x1355/0x1770
>>>>>>>> [40003.052674]  [<ffffffff8118e4c5>] ? do_mmap_pgoff+0x2f5/0x3c0
>>>>>>>> [40003.054737]  [<ffffffff8105dafc>] __do_page_fault+0x17c/0x420
>>>>>>>> [40003.056858]  [<ffffffff8118c976>] ? SyS_mmap_pgoff+0x116/0x270
>>>>>>>> [40003.059082]  [<ffffffff8105ddc2>] do_page_fault+0x22/0x30
>>>>>>>> [40003.061084]  [<ffffffff814e2ad8>] page_fault+0x28/0x30
>>>>>>>> [40003.062938] Mem-Info:
>>>>>>>> [40003.063762] active_anon:5437903 inactive_anon:1025
>>>>>>>> isolated_anon:0
>>>>>>>>  active_file:51 inactive_file:8 isolated_file:0
>>>>>>>>  unevictable:2088582 dirty:0 writeback:0 unstable:0
>>>>>>>>  slab_reclaimable:82028 slab_unreclaimable:12209
>>>>>>>>  mapped:31065 shmem:20 pagetables:37089 bounce:0
>>>>>>>>  free:35830 free_pcp:3141 free_cma:0
>>>>>>>> [40003.075549] Node 0 DMA free:15872kB min:8kB low:8kB high:12kB
>>>>>>>> active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB
>>>>>>>> unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15988kB
>>>>>>>> managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB
>>>>>>>> slab_reclaimable:32kB slab_unreclaimable:0kB kernel_stack:0kB
>>>>>>>> pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB
>>>>>>>> free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
>>>>>>>> [40003.090267] lowmem_reserve[]: 0 3746 30128 30128
>>>>>>>> [40003.092182] Node 0 DMA32 free:108236kB min:2756kB low:3444kB
>>>>>>>> high:4132kB active_anon:2400616kB inactive_anon:4060kB active_file:0kB
>>>>>>>> inactive_file:0kB unevictable:1049732kB isolated(anon):0kB
>>>>>>>> isolated(file):0kB present:3915776kB managed:3840296kB mlocked:1049732kB
>>>>>>>> dirty:4kB writeback:0kB mapped:16564kB shmem:12kB slab_reclaimable:243852kB
>>>>>>>> slab_unreclaimable:8832kB kernel_stack:1152kB pagetables:16532kB
>>>>>>>> unstable:0kB bounce:0kB free_pcp:5716kB local_pcp:220kB free_cma:0kB
>>>>>>>> writeback_tmp:0kB pages_scanned:5408 all_unreclaimable? yes
>>>>>>>> [40003.108802] lowmem_reserve[]: 0 0 26382 26382
>>>>>>>> [40003.110578] Node 0 Normal free:19212kB min:19412kB low:24264kB
>>>>>>>> high:29116kB active_anon:19350996kB inactive_anon:40kB active_file:212kB
>>>>>>>> inactive_file:80kB unevictable:7304596kB isolated(anon):0kB
>>>>>>>> isolated(file):0kB present:27525120kB managed:27015196kB mlocked:7304596kB
>>>>>>>> dirty:0kB writeback:0kB mapped:107696kB shmem:68kB slab_reclaimable:84228kB
>>>>>>>> slab_unreclaimable:40004kB kernel_stack:10000kB pagetables:131824kB
>>>>>>>> unstable:0kB bounce:0kB free_pcp:6848kB local_pcp:692kB free_cma:0kB
>>>>>>>> writeback_tmp:0kB pages_scanned:38332 all_unreclaimable? yes
>>>>>>>> [40003.128300] lowmem_reserve[]: 0 0 0 0
>>>>>>>> [40003.129844] Node 0 DMA: 0*4kB 0*8kB 0*16kB 2*32kB (UE) 3*64kB
>>>>>>>> (UE) 2*128kB (UE) 2*256kB (UE) 1*512kB (E) 2*1024kB (UE) 2*2048kB (ER)
>>>>>>>> 2*4096kB (M) = 15872kB
>>>>>>>> [40003.135917] Node 0 DMA32: 193*4kB (UEM) 254*8kB (UEM) 714*16kB
>>>>>>>> (UE) 1344*32kB (UEMR) 249*64kB (UEMR) 120*128kB (UER) 53*256kB (ER)
>>>>>>>> 10*512kB (ER) 1*1024kB (E) 0*2048kB 0*4096kB = 108244kB
>>>>>>>> [40003.142956] Node 0 Normal: 3956*4kB (UE) 0*8kB 1*16kB (R) 8*32kB
>>>>>>>> (R) 3*64kB (R) 2*128kB (R) 3*256kB (R) 0*512kB 0*1024kB 1*2048kB (R)
>>>>>>>> 0*4096kB = 19360kB
>>>>>>>> [40003.148749] Node 0 hugepages_total=0 hugepages_free=0
>>>>>>>> hugepages_surp=0 hugepages_size=2048kB
>>>>>>>> [40003.151777] 31304 total pagecache pages
>>>>>>>> [40003.153288] 0 pages in swap cache
>>>>>>>> [40003.154528] Swap cache stats: add 0, delete 0, find 0/0
>>>>>>>> [40003.156377] Free swap  = 0kB
>>>>>>>> [40003.157423] Total swap = 0kB
>>>>>>>> [40003.158465] 7864221 pages RAM
>>>>>>>> [40003.159522] 0 pages HighMem/MovableOnly
>>>>>>>> [40003.160984] 146372 pages reserved
>>>>>>>> [40003.162244] [ pid ]   uid  tgid total_vm      rss nr_ptes
>>>>>>>> nr_pmds swapents oom_score_adj name
>>>>>>>> [40003.165398] [ 2560]     0  2560     2804      181      11
>>>>>>>> 3        0         -1000 udevd
>>>>>>>> [40003.168638] [ 3976]     0  3976     2334      123       9
>>>>>>>> 3        0             0 dhclient
>>>>>>>> [40003.171895] [ 4017]     0  4017    11626       89      23
>>>>>>>> 4        0         -1000 auditd
>>>>>>>> [40003.175080] [ 4035]     0  4035    61861       99      23
>>>>>>>> 3        0             0 rsyslogd
>>>>>>>> [40003.178198] [ 4046]     0  4046     3462       98      10
>>>>>>>> 3        0             0 irqbalance
>>>>>>>> [40003.181559] [ 4052]     0  4052     1096       22       7
>>>>>>>> 3        0             0 rngd
>>>>>>>> [40003.184683] [ 4067]    32  4067     8815       99      22
>>>>>>>> 3        0             0 rpcbind
>>>>>>>> [40003.187772] [ 4084]    29  4084     9957      201      24
>>>>>>>> 3        0             0 rpc.statd
>>>>>>>> [40003.191099] [ 4115]    81  4115     5442       60      15
>>>>>>>> 3        0             0 dbus-daemon
>>>>>>>> [40003.194438] [ 4333]     0  4333    19452      522      40
>>>>>>>> 3        0         -1000 sshd
>>>>>>>> [40003.197432] [ 4361]    38  4361     7321      562      19
>>>>>>>> 3        0             0 ntpd
>>>>>>>> [40003.200609] [ 4376]     0  4376    22238      720      46
>>>>>>>> 3        0             0 sendmail
>>>>>>>> [40003.203868] [ 4384]    51  4384    20103      674      41
>>>>>>>> 3        0             0 sendmail
>>>>>>>> [40003.206963] [ 4515]     0  4515     4267       38      13
>>>>>>>> 3        0             0 atd
>>>>>>>> [40003.210100] [ 6730]     0  6730    29888      547      13
>>>>>>>> 3        0             0 crond
>>>>>>>> [40003.213267] [13533]   497 13533 47235415  7455314   36074
>>>>>>>> 167        0             0 java
>>>>>>>> [40003.216364] [13674]   498 13674    49154     3168      51
>>>>>>>> 3        0             0 supervisord
>>>>>>>> [40003.219721] [13680]   498 13680    51046     5350      69
>>>>>>>> 3        0             0 python
>>>>>>>> [40003.222908] [13682]   498 13682    36172     5602      75
>>>>>>>> 3        0             0 python
>>>>>>>> [40003.225952] [13683]   498 13683    32633     5319      68
>>>>>>>> 3        0             0 python
>>>>>>>> [40003.229108] [13684]   498 13684    29577     5003      63
>>>>>>>> 3        0             0 python
>>>>>>>> [40003.232263] [13719]   498 13719  1035920    41287     234
>>>>>>>> 8        0             0 java
>>>>>>>> [40003.235287] [13753]   498 13753    34605     5645      70
>>>>>>>> 3        0             0 python
>>>>>>>> [40003.238322] [14143]     0 14143     1615      420       9
>>>>>>>> 3        0             0 agetty
>>>>>>>> [40003.241582] [14145]     0 14145     1078      377       8
>>>>>>>> 3        0             0 mingetty
>>>>>>>> [40003.244752] [14147]     0 14147     1078      354       8
>>>>>>>> 3        0             0 mingetty
>>>>>>>> [40003.247833] [14149]     0 14149     1078      373       8
>>>>>>>> 3        0             0 mingetty
>>>>>>>> [40003.251193] [14151]     0 14151     1078      367       7
>>>>>>>> 3        0             0 mingetty
>>>>>>>> [40003.254342] [14153]     0 14153     1078      348       8
>>>>>>>> 3        0             0 mingetty
>>>>>>>> [40003.257443] [14154]     0 14154     2803      182      10
>>>>>>>> 3        0         -1000 udevd
>>>>>>>> [40003.260688] [14155]     0 14155     2803      182      10
>>>>>>>> 3        0         -1000 udevd
>>>>>>>> [40003.263782] [14157]     0 14157     1078      369       8
>>>>>>>> 3        0             0 mingetty
>>>>>>>> [40003.266895] Out of memory: Kill process 13533 (java) score 970
>>>>>>>> or sacrifice child
>>>>>>>> [40003.269702] Killed process 13533 (java) total-vm:188941660kB,
>>>>>>>> anon-rss:29710828kB, file-rss:110428kB
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Cassandra causing OOM Killer to strike on new cluster running 3.4

Posted by Adam Plumb <ap...@fiksu.com>.
Thanks for the link!  Luckily the cluster I'm running is not yet in
production and running with dummy data so I will throw that jar on the
nodes and I'll let you know how things shake out.
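
The swap itself should just be a matter of replacing the stock jar under lib/
with the one ant produces under build/ and restarting each node.  Roughly
(a sketch only -- the path comes from the ps output quoted below, and the built
jar name and restart command will depend on the checkout and on how the node is
managed):

  # keep a copy of the stock jar, then drop in the patched build
  cp /usr/local/cassandra/lib/apache-cassandra-3.4.jar /tmp/apache-cassandra-3.4.jar.bak
  cp ~/cassandra/build/apache-cassandra-3.4.jar /usr/local/cassandra/lib/apache-cassandra-3.4.jar
  # then restart Cassandra on the node with whatever manages the service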

On Sun, Mar 13, 2016 at 11:02 PM, Paulo Motta <pa...@gmail.com>
wrote:

> You could be hitting CASSANDRA-11344 (
> https://issues.apache.org/jira/browse/CASSANDRA-11344).  If that's the
> case, you may try to replace your cassandra jar on an affected node with a
> version with this fix in place and force bloom filter regeneration to see
> if it fixes your problem. You can build with "ant jar" from this branch:
> https://github.com/pauloricardomg/cassandra/tree/3.4-11344
>
> You can force bloom filter regeneration by either removing your *Filter.db
> files (make sure to back them up first for safety) or changing the
> bloom_filter_fp_chance before restarting affected nodes with the fixed jar.
>
> 2016-03-13 19:51 GMT-03:00 Adam Plumb <ap...@fiksu.com>:
>
>> So it's looking like the bloom filter off heap memory usage is ramping up
>> and up until the OOM killer kills the java process.  I relaunched on
>> instances with 60GB of memory and the same thing is happening.  A node will
>> start using more and more RAM until the process is killed, then another
>> node will start using more and more until it is also killed.
>>
>> Is this the expected behavior?  It doesn't seem ideal to me.  Is there
>> anything obvious that I'm doing wrong?
>>
>> On Fri, Mar 11, 2016 at 11:31 AM, Adam Plumb <ap...@fiksu.com> wrote:
>>
>>> Here is the creation syntax for the entire schema.  The xyz table has
>>> about 2.1 billion keys and the def table has about 230 million keys.  Max
>>> row size is about 3KB, mean row size is 700B.
>>>
>>> CREATE KEYSPACE abc WITH replication = {'class':
>>>> 'NetworkTopologyStrategy', 'us-east': 3};
>>>> CREATE TABLE xyz (
>>>>   id text,
>>>>   secondary_id int,
>>>>   data text,
>>>>   PRIMARY KEY(id)
>>>> )
>>>>   WITH
>>>>   compaction = { 'class': 'LeveledCompactionStrategy' }
>>>>   and compression = {'class': 'LZ4Compressor'};
>>>> CREATE INDEX secondary_id_index ON abc.xyz (secondary_id);
>>>> CREATE TABLE def (
>>>>   id text,
>>>>   secondary_id int,
>>>>   data text,
>>>>   PRIMARY KEY(id)
>>>> )
>>>>   WITH
>>>>   compaction = { 'class': 'LeveledCompactionStrategy' }
>>>>   and compression = {'class': 'LZ4Compressor'};
>>>> CREATE INDEX secondary_id_index_def ON abc.def (secondary_id);
>>>
>>>
>>> On Fri, Mar 11, 2016 at 11:24 AM, Jack Krupansky <
>>> jack.krupansky@gmail.com> wrote:
>>>
>>>> What is your schema and data like - in particular, how wide are your
>>>> partitions (number of rows and typical row size)?
>>>>
>>>> Maybe you just need (a lot) more heap for rows during the repair
>>>> process.
>>>>
>>>> -- Jack Krupansky
>>>>
>>>> On Fri, Mar 11, 2016 at 11:19 AM, Adam Plumb <ap...@fiksu.com> wrote:
>>>>
>>>>> These are brand new boxes only running Cassandra.  Yeah the kernel is
>>>>> what is killing the JVM, and this does appear to be a memory leak in
>>>>> Cassandra.  And Cassandra is the only thing running, aside from the basic
>>>>> services needed for Amazon Linux to run.
>>>>>
>>>>> On Fri, Mar 11, 2016 at 11:17 AM, Sebastian Estevez <
>>>>> sebastian.estevez@datastax.com> wrote:
>>>>>
>>>>>> Sacrifice child in dmesg is your OS killing the process with the most
>>>>>> ram. That means you're actually running out of memory at the Linux level
>>>>>> outside of the JVM.
>>>>>>
>>>>>> Are you running anything other than Cassandra on this box?
>>>>>>
>>>>>> If so, does it have a memory leak?
>>>>>>
>>>>>> all the best,
>>>>>>
>>>>>> Sebastián
>>>>>> On Mar 11, 2016 11:14 AM, "Adam Plumb" <ap...@fiksu.com> wrote:
>>>>>>
>>>>>>> I've got a new cluster of 18 nodes running Cassandra 3.4 that I just
>>>>>>> launched and loaded data into yesterday (roughly 2TB of total storage) and
>>>>>>> am seeing runaway memory usage.  These nodes are EC2 c3.4xlarges with 30GB
>>>>>>> RAM and the heap size is set to 8G with a new heap size of 1.6G.
>>>>>>>
>>>>>>> Last night I finished loading up the data, then ran an incremental
>>>>>>> repair on one of the nodes just to ensure that everything was working
>>>>>>> (nodetool repair).  Over night all 18 nodes ran out of memory and were
>>>>>>> killed by the OOM killer.  I restarted them this morning and they all came
>>>>>>> up fine, but just started churning through memory and got killed again.  I
>>>>>>> restarted them again and they're doing the same thing.  I'm not getting any
>>>>>>> errors in the system log, since the process is getting killed abruptly
>>>>>>> (which makes me think this is a native memory issue, not heap)
>>>>>>>
>>>>>>> Obviously this behavior isn't the best.  I'm willing to provide any
>>>>>>> data people need to help debug this, these nodes are still up and running.
>>>>>>> I'm also in IRC if anyone wants to jump on there.
>>>>>>>
>>>>>>> Here is the output of ps aux:
>>>>>>>
>>>>>>> 497       64351  108 89.5 187156072 27642988 ?  SLl  15:13  62:15
>>>>>>>> java -ea -XX:+CMSClassUnloadingEnabled -XX:+UseThreadPriorities
>>>>>>>> -XX:ThreadPriorityPolicy=42 -Xms7536M -Xmx7536M -Xmn1600M
>>>>>>>> -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=1000003
>>>>>>>> -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
>>>>>>>> -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1
>>>>>>>> -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
>>>>>>>> -XX:+UseTLAB -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=45
>>>>>>>> -XX:-ParallelRefProcEnabled -XX:-AlwaysPreTouch -XX:+UseBiasedLocking
>>>>>>>> -XX:+UseTLAB -XX:+ResizeTLAB -Djava.net.preferIPv4Stack=true
>>>>>>>> -Dcom.sun.management.jmxremote.port=7199
>>>>>>>> -Dcom.sun.management.jmxremote.rmi.port=7199
>>>>>>>> -Dcom.sun.management.jmxremote.ssl=false
>>>>>>>> -Dcom.sun.management.jmxremote.authenticate=false
>>>>>>>> -XX:+CMSClassUnloadingEnabled -Dlogback.configurationFile=logback.xml -D
>>>>>>>> *cas*sandra.logdir=/usr/local/*cas*sandra/logs -D*cas*
>>>>>>>> sandra.storagedir=/usr/local/*cas*sandra/data -D*cas*
>>>>>>>> sandra-pidfile=/var/run/*cas*sandra/*cas*sandra.pid -cp /usr/local/
>>>>>>>> *cas*sandra/conf:/usr/local/*cas*
>>>>>>>> sandra/build/classes/main:/usr/local/*cas*
>>>>>>>> sandra/build/classes/thrift:/usr/local/*cas*
>>>>>>>> sandra/lib/airline-0.6.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/antlr-runtime-3.5.2.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/apache-*cas*sandra-3.4.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/apache-*cas*sandra-clientutil-3.4.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/apache-*cas*sandra-thrift-3.4.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/asm-5.0.4.jar:/usr/local/*cas*sandra/lib/*cas*
>>>>>>>> sandra-driver-core-3.0.0-shaded.jar:/usr/local/*ca*
>>>>>>>> sandra/lib/commons-cli-1.1.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/commons-codec-1.2.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/commons-lang3-3.1.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/commons-math3-3.2.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/compress-lzf-0.8.4.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/concurrentlinkedhashmap-lru-1.4.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/concurrent-trees-2.4.0.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/disruptor-3.0.1.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/ecj-4.4.2.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/guava-18.0.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/high-scale-lib-1.0.6.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/hppc-0.5.4.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/jackson-core-asl-1.9.2.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/jamm-0.3.0.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/javax.inject.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/jbcrypt-0.3m.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/jcl-over-slf4j-1.7.7.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/jflex-1.6.0.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/jna-4.0.0.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/joda-time-2.4.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/json-simple-1.1.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/libthrift-0.9.2.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/log4j-over-slf4j-1.7.7.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/logback-classic-1.1.3.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/logback-core-1.1.3.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/lz4-1.3.0.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/metrics-core-3.1.0.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/metrics-logback-3.1.0.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/netty-all-4.0.23.Final.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/ohc-core-0.4.2.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/ohc-core-j8-0.4.2.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/primitive-1.0.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/reporter-config3-3.0.0.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/reporter-config-base-3.0.0.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/sigar-1.6.4.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/slf4j-api-1.7.7.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/snakeyaml-1.11.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/snappy-java-1.1.1.7.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/snowball-stemmer-1.3.0.581.1.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/ST4-4.0.8.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/stream-2.5.2.jar:/usr/local/*cas*
>>>>>>>> sandra/lib/thrift-server-0.3.7.jar:/usr/local/*cas*sandra/lib/jsr223/*/*.jar
>>>>>>>> org.apache.*cas*sandra.service.CassandraDaemon
>>>>>>>
>>>>>>>
>>>>>>>  Here is some dmesg output:
>>>>>>>
>>>>>>> [40003.010117] java invoked oom-killer: gfp_mask=0x280da, order=0,
>>>>>>> oom_score_adj=0
>>>>>>> [40003.013042] java cpuset=/ mems_allowed=0
>>>>>>> [40003.014789] CPU: 3 PID: 37757 Comm: java Tainted: G            E
>>>>>>>   4.1.7-15.23.amzn1.x86_64 #1
>>>>>>> [40003.017852] Hardware name: Xen HVM domU, BIOS 4.2.amazon
>>>>>>> 12/07/2015
>>>>>>> [40003.020066]  0000000000000000 ffff8800ebaaba18 ffffffff814da12c
>>>>>>> 0000000000000000
>>>>>>> [40003.022870]  ffff880763594c80 ffff8800ebaabac8 ffffffff814d7939
>>>>>>> ffff8800ebaaba78
>>>>>>> [40003.025674]  ffffffff811bf8f7 ffff880770679c00 ffff88077001c190
>>>>>>> 0000000000000080
>>>>>>> [40003.028660] Call Trace:
>>>>>>> [40003.029613]  [<ffffffff814da12c>] dump_stack+0x45/0x57
>>>>>>> [40003.031486]  [<ffffffff814d7939>] dump_header+0x7f/0x1fe
>>>>>>> [40003.033390]  [<ffffffff811bf8f7>] ? mem_cgroup_iter+0x137/0x3d0
>>>>>>> [40003.035475]  [<ffffffff8107f496>] ? __queue_work+0x136/0x320
>>>>>>> [40003.037594]  [<ffffffff8115d11c>] oom_kill_process+0x1cc/0x3b0
>>>>>>> [40003.039825]  [<ffffffff8115d67e>] __out_of_memory+0x31e/0x530
>>>>>>> [40003.041938]  [<ffffffff8115da2b>] out_of_memory+0x5b/0x80
>>>>>>> [40003.043857]  [<ffffffff81162a79>]
>>>>>>> __alloc_pages_nodemask+0x8a9/0x8d0
>>>>>>> [40003.046105]  [<ffffffff811a48fa>] alloc_page_interleave+0x3a/0x90
>>>>>>> [40003.048419]  [<ffffffff811a79c3>] alloc_pages_vma+0x143/0x200
>>>>>>> [40003.050582]  [<ffffffff81188035>] handle_mm_fault+0x1355/0x1770
>>>>>>> [40003.052674]  [<ffffffff8118e4c5>] ? do_mmap_pgoff+0x2f5/0x3c0
>>>>>>> [40003.054737]  [<ffffffff8105dafc>] __do_page_fault+0x17c/0x420
>>>>>>> [40003.056858]  [<ffffffff8118c976>] ? SyS_mmap_pgoff+0x116/0x270
>>>>>>> [40003.059082]  [<ffffffff8105ddc2>] do_page_fault+0x22/0x30
>>>>>>> [40003.061084]  [<ffffffff814e2ad8>] page_fault+0x28/0x30
>>>>>>> [40003.062938] Mem-Info:
>>>>>>> [40003.063762] active_anon:5437903 inactive_anon:1025 isolated_anon:0
>>>>>>>  active_file:51 inactive_file:8 isolated_file:0
>>>>>>>  unevictable:2088582 dirty:0 writeback:0 unstable:0
>>>>>>>  slab_reclaimable:82028 slab_unreclaimable:12209
>>>>>>>  mapped:31065 shmem:20 pagetables:37089 bounce:0
>>>>>>>  free:35830 free_pcp:3141 free_cma:0
>>>>>>> [40003.075549] Node 0 DMA free:15872kB min:8kB low:8kB high:12kB
>>>>>>> active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB
>>>>>>> unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15988kB
>>>>>>> managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB
>>>>>>> slab_reclaimable:32kB slab_unreclaimable:0kB kernel_stack:0kB
>>>>>>> pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB
>>>>>>> free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
>>>>>>> [40003.090267] lowmem_reserve[]: 0 3746 30128 30128
>>>>>>> [40003.092182] Node 0 DMA32 free:108236kB min:2756kB low:3444kB
>>>>>>> high:4132kB active_anon:2400616kB inactive_anon:4060kB active_file:0kB
>>>>>>> inactive_file:0kB unevictable:1049732kB isolated(anon):0kB
>>>>>>> isolated(file):0kB present:3915776kB managed:3840296kB mlocked:1049732kB
>>>>>>> dirty:4kB writeback:0kB mapped:16564kB shmem:12kB slab_reclaimable:243852kB
>>>>>>> slab_unreclaimable:8832kB kernel_stack:1152kB pagetables:16532kB
>>>>>>> unstable:0kB bounce:0kB free_pcp:5716kB local_pcp:220kB free_cma:0kB
>>>>>>> writeback_tmp:0kB pages_scanned:5408 all_unreclaimable? yes
>>>>>>> [40003.108802] lowmem_reserve[]: 0 0 26382 26382
>>>>>>> [40003.110578] Node 0 Normal free:19212kB min:19412kB low:24264kB
>>>>>>> high:29116kB active_anon:19350996kB inactive_anon:40kB active_file:212kB
>>>>>>> inactive_file:80kB unevictable:7304596kB isolated(anon):0kB
>>>>>>> isolated(file):0kB present:27525120kB managed:27015196kB mlocked:7304596kB
>>>>>>> dirty:0kB writeback:0kB mapped:107696kB shmem:68kB slab_reclaimable:84228kB
>>>>>>> slab_unreclaimable:40004kB kernel_stack:10000kB pagetables:131824kB
>>>>>>> unstable:0kB bounce:0kB free_pcp:6848kB local_pcp:692kB free_cma:0kB
>>>>>>> writeback_tmp:0kB pages_scanned:38332 all_unreclaimable? yes
>>>>>>> [40003.128300] lowmem_reserve[]: 0 0 0 0
>>>>>>> [40003.129844] Node 0 DMA: 0*4kB 0*8kB 0*16kB 2*32kB (UE) 3*64kB
>>>>>>> (UE) 2*128kB (UE) 2*256kB (UE) 1*512kB (E) 2*1024kB (UE) 2*2048kB (ER)
>>>>>>> 2*4096kB (M) = 15872kB
>>>>>>> [40003.135917] Node 0 DMA32: 193*4kB (UEM) 254*8kB (UEM) 714*16kB
>>>>>>> (UE) 1344*32kB (UEMR) 249*64kB (UEMR) 120*128kB (UER) 53*256kB (ER)
>>>>>>> 10*512kB (ER) 1*1024kB (E) 0*2048kB 0*4096kB = 108244kB
>>>>>>> [40003.142956] Node 0 Normal: 3956*4kB (UE) 0*8kB 1*16kB (R) 8*32kB
>>>>>>> (R) 3*64kB (R) 2*128kB (R) 3*256kB (R) 0*512kB 0*1024kB 1*2048kB (R)
>>>>>>> 0*4096kB = 19360kB
>>>>>>> [40003.148749] Node 0 hugepages_total=0 hugepages_free=0
>>>>>>> hugepages_surp=0 hugepages_size=2048kB
>>>>>>> [40003.151777] 31304 total pagecache pages
>>>>>>> [40003.153288] 0 pages in swap cache
>>>>>>> [40003.154528] Swap cache stats: add 0, delete 0, find 0/0
>>>>>>> [40003.156377] Free swap  = 0kB
>>>>>>> [40003.157423] Total swap = 0kB
>>>>>>> [40003.158465] 7864221 pages RAM
>>>>>>> [40003.159522] 0 pages HighMem/MovableOnly
>>>>>>> [40003.160984] 146372 pages reserved
>>>>>>> [40003.162244] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds
>>>>>>> swapents oom_score_adj name
>>>>>>> [40003.165398] [ 2560]     0  2560     2804      181      11
>>>>>>> 3        0         -1000 udevd
>>>>>>> [40003.168638] [ 3976]     0  3976     2334      123       9
>>>>>>> 3        0             0 dhclient
>>>>>>> [40003.171895] [ 4017]     0  4017    11626       89      23
>>>>>>> 4        0         -1000 auditd
>>>>>>> [40003.175080] [ 4035]     0  4035    61861       99      23
>>>>>>> 3        0             0 rsyslogd
>>>>>>> [40003.178198] [ 4046]     0  4046     3462       98      10
>>>>>>> 3        0             0 irqbalance
>>>>>>> [40003.181559] [ 4052]     0  4052     1096       22       7
>>>>>>> 3        0             0 rngd
>>>>>>> [40003.184683] [ 4067]    32  4067     8815       99      22
>>>>>>> 3        0             0 rpcbind
>>>>>>> [40003.187772] [ 4084]    29  4084     9957      201      24
>>>>>>> 3        0             0 rpc.statd
>>>>>>> [40003.191099] [ 4115]    81  4115     5442       60      15
>>>>>>> 3        0             0 dbus-daemon
>>>>>>> [40003.194438] [ 4333]     0  4333    19452      522      40
>>>>>>> 3        0         -1000 sshd
>>>>>>> [40003.197432] [ 4361]    38  4361     7321      562      19
>>>>>>> 3        0             0 ntpd
>>>>>>> [40003.200609] [ 4376]     0  4376    22238      720      46
>>>>>>> 3        0             0 sendmail
>>>>>>> [40003.203868] [ 4384]    51  4384    20103      674      41
>>>>>>> 3        0             0 sendmail
>>>>>>> [40003.206963] [ 4515]     0  4515     4267       38      13
>>>>>>> 3        0             0 atd
>>>>>>> [40003.210100] [ 6730]     0  6730    29888      547      13
>>>>>>> 3        0             0 crond
>>>>>>> [40003.213267] [13533]   497 13533 47235415  7455314   36074
>>>>>>> 167        0             0 java
>>>>>>> [40003.216364] [13674]   498 13674    49154     3168      51
>>>>>>> 3        0             0 supervisord
>>>>>>> [40003.219721] [13680]   498 13680    51046     5350      69
>>>>>>> 3        0             0 python
>>>>>>> [40003.222908] [13682]   498 13682    36172     5602      75
>>>>>>> 3        0             0 python
>>>>>>> [40003.225952] [13683]   498 13683    32633     5319      68
>>>>>>> 3        0             0 python
>>>>>>> [40003.229108] [13684]   498 13684    29577     5003      63
>>>>>>> 3        0             0 python
>>>>>>> [40003.232263] [13719]   498 13719  1035920    41287     234
>>>>>>> 8        0             0 java
>>>>>>> [40003.235287] [13753]   498 13753    34605     5645      70
>>>>>>> 3        0             0 python
>>>>>>> [40003.238322] [14143]     0 14143     1615      420       9
>>>>>>> 3        0             0 agetty
>>>>>>> [40003.241582] [14145]     0 14145     1078      377       8
>>>>>>> 3        0             0 mingetty
>>>>>>> [40003.244752] [14147]     0 14147     1078      354       8
>>>>>>> 3        0             0 mingetty
>>>>>>> [40003.247833] [14149]     0 14149     1078      373       8
>>>>>>> 3        0             0 mingetty
>>>>>>> [40003.251193] [14151]     0 14151     1078      367       7
>>>>>>> 3        0             0 mingetty
>>>>>>> [40003.254342] [14153]     0 14153     1078      348       8
>>>>>>> 3        0             0 mingetty
>>>>>>> [40003.257443] [14154]     0 14154     2803      182      10
>>>>>>> 3        0         -1000 udevd
>>>>>>> [40003.260688] [14155]     0 14155     2803      182      10
>>>>>>> 3        0         -1000 udevd
>>>>>>> [40003.263782] [14157]     0 14157     1078      369       8
>>>>>>> 3        0             0 mingetty
>>>>>>> [40003.266895] Out of memory: Kill process 13533 (java) score 970 or
>>>>>>> sacrifice child
>>>>>>> [40003.269702] Killed process 13533 (java) total-vm:188941660kB,
>>>>>>> anon-rss:29710828kB, file-rss:110428kB
>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Cassandra causing OOM Killer to strike on new cluster running 3.4

Posted by Paulo Motta <pa...@gmail.com>.
You could be hitting CASSANDRA-11344 (
https://issues.apache.org/jira/browse/CASSANDRA-11344).  If that's the
case, you may try to replace your cassandra jar on an affected node with a
version with this fix in place and force bloom filter regeneration to see
if it fixes your problem. You can build with "ant jar" from this branch:
https://github.com/pauloricardomg/cassandra/tree/3.4-11344

You can force bloom filter regeneration by either removing your *Filter.db
files (back them up first for safety) or changing bloom_filter_fp_chance
before restarting the affected nodes with the fixed jar.
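
Roughly, and assuming a tarball-style install under /usr/local/cassandra (which
is what the classpath earlier in the thread suggests), those steps might look
like the sketch below; the built jar name, the service name and the backup
paths are illustrative rather than tested values:

    # build a jar containing the CASSANDRA-11344 fix
    git clone https://github.com/pauloricardomg/cassandra.git
    cd cassandra && git checkout 3.4-11344 && ant jar

    # on each affected node: stop Cassandra and swap the jar in
    sudo service cassandra stop                                     # or however the daemon is managed
    mv /usr/local/cassandra/lib/apache-cassandra-3.4.jar /var/tmp/  # keep the original jar
    cp build/apache-cassandra-3.4.jar /usr/local/cassandra/lib/     # exact built jar name may differ

    # force bloom filter regeneration by removing the Filter components
    # (backed up first, as advised above); do this while the node is stopped
    cd /usr/local/cassandra/data
    find . -name '*Filter.db' | tar czf /var/tmp/filter-db-backup.tgz -T -
    find . -name '*Filter.db' -delete

    sudo service cassandra start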

2016-03-13 19:51 GMT-03:00 Adam Plumb <ap...@fiksu.com>:

> So it's looking like the bloom filter off heap memory usage is ramping up
> and up until the OOM killer kills the java process.  I relaunched on
> instances with 60GB of memory and the same thing is happening.  A node will
> start using more and more RAM until the process is killed, then another
> node will start using more and more until it is also killed.
>
> Is this the expected behavior?  It doesn't seem ideal to me.  Is there
> anything obvious that I'm doing wrong?
>
> On Fri, Mar 11, 2016 at 11:31 AM, Adam Plumb <ap...@fiksu.com> wrote:
>
>> Here is the creation syntax for the entire schema.  The xyz table has
>> about 2.1 billion keys and the def table has about 230 million keys.  Max
>> row size is about 3KB, mean row size is 700B.
>>
>> CREATE KEYSPACE abc WITH replication = {'class':
>>> 'NetworkTopologyStrategy', 'us-east': 3};
>>> CREATE TABLE xyz (
>>>   id text,
>>>   secondary_id int,
>>>   data text,
>>>   PRIMARY KEY(id)
>>> )
>>>   WITH
>>>   compaction = { 'class': 'LeveledCompactionStrategy' }
>>>   and compression = {'class': 'LZ4Compressor'};
>>> CREATE INDEX secondary_id_index ON abc.xyz (secondary_id);
>>> CREATE TABLE def (
>>>   id text,
>>>   secondary_id int,
>>>   data text,
>>>   PRIMARY KEY(id)
>>> )
>>>   WITH
>>>   compaction = { 'class': 'LeveledCompactionStrategy' }
>>>   and compression = {'class': 'LZ4Compressor'};
>>> CREATE INDEX secondary_id_index_def ON abc.def (secondary_id);
>>
>>
>> On Fri, Mar 11, 2016 at 11:24 AM, Jack Krupansky <
>> jack.krupansky@gmail.com> wrote:
>>
>>> What is your schema and data like - in particular, how wide are your
>>> partitions (number of rows and typical row size)?
>>>
>>> Maybe you just need (a lot) more heap for rows during the repair process.
>>>
>>> -- Jack Krupansky
>>>
>>> On Fri, Mar 11, 2016 at 11:19 AM, Adam Plumb <ap...@fiksu.com> wrote:
>>>
>>>> These are brand new boxes only running Cassandra.  Yeah the kernel is
>>>> what is killing the JVM, and this does appear to be a memory leak in
>>>> Cassandra.  And Cassandra is the only thing running, aside from the basic
>>>> services needed for Amazon Linux to run.
>>>>
>>>> On Fri, Mar 11, 2016 at 11:17 AM, Sebastian Estevez <
>>>> sebastian.estevez@datastax.com> wrote:
>>>>
>>>>> Sacrifice child in dmesg is your OS killing the process with the most
>>>>> ram. That means you're actually running out of memory at the Linux level
>>>>> outside of the JVM.
>>>>>
>>>>> Are you running anything other than Cassandra on this box?
>>>>>
>>>>> If so, does it have a memory leak?
>>>>>
>>>>> all the best,
>>>>>
>>>>> Sebastián
>>>>> On Mar 11, 2016 11:14 AM, "Adam Plumb" <ap...@fiksu.com> wrote:
>>>>>
>>>>>> I've got a new cluster of 18 nodes running Cassandra 3.4 that I just
>>>>>> launched and loaded data into yesterday (roughly 2TB of total storage) and
>>>>>> am seeing runaway memory usage.  These nodes are EC2 c3.4xlarges with 30GB
>>>>>> RAM and the heap size is set to 8G with a new heap size of 1.6G.
>>>>>>
>>>>>> Last night I finished loading up the data, then ran an incremental
>>>>>> repair on one of the nodes just to ensure that everything was working
>>>>>> (nodetool repair).  Over night all 18 nodes ran out of memory and were
>>>>>> killed by the OOM killer.  I restarted them this morning and they all came
>>>>>> up fine, but just started churning through memory and got killed again.  I
>>>>>> restarted them again and they're doing the same thing.  I'm not getting any
>>>>>> errors in the system log, since the process is getting killed abruptly
>>>>>> (which makes me think this is a native memory issue, not heap)
>>>>>>
>>>>>> Obviously this behavior isn't the best.  I'm willing to provide any
>>>>>> data people need to help debug this, these nodes are still up and running.
>>>>>> I'm also in IRC if anyone wants to jump on there.
>>>>>>
>>>>>> Here is the output of ps aux:
>>>>>>
>>>>>> 497       64351  108 89.5 187156072 27642988 ?  SLl  15:13  62:15
>>>>>>> java -ea -XX:+CMSClassUnloadingEnabled -XX:+UseThreadPriorities
>>>>>>> -XX:ThreadPriorityPolicy=42 -Xms7536M -Xmx7536M -Xmn1600M
>>>>>>> -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=1000003
>>>>>>> -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
>>>>>>> -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1
>>>>>>> -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
>>>>>>> -XX:+UseTLAB -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=45
>>>>>>> -XX:-ParallelRefProcEnabled -XX:-AlwaysPreTouch -XX:+UseBiasedLocking
>>>>>>> -XX:+UseTLAB -XX:+ResizeTLAB -Djava.net.preferIPv4Stack=true
>>>>>>> -Dcom.sun.management.jmxremote.port=7199
>>>>>>> -Dcom.sun.management.jmxremote.rmi.port=7199
>>>>>>> -Dcom.sun.management.jmxremote.ssl=false
>>>>>>> -Dcom.sun.management.jmxremote.authenticate=false
>>>>>>> -XX:+CMSClassUnloadingEnabled -Dlogback.configurationFile=logback.xml -D
>>>>>>> cassandra.logdir=/usr/local/cassandra/logs -Dcas
>>>>>>> sandra.storagedir=/usr/local/cassandra/data -Dcas
>>>>>>> sandra-pidfile=/var/run/cassandra/cassandra.pid -cp /usr/local/
>>>>>>> cassandra/conf:/usr/local/cas
>>>>>>> sandra/build/classes/main:/usr/local/cas
>>>>>>> sandra/build/classes/thrift:/usr/local/cas
>>>>>>> sandra/lib/airline-0.6.jar:/usr/local/cas
>>>>>>> sandra/lib/antlr-runtime-3.5.2.jar:/usr/local/cas
>>>>>>> sandra/lib/apache-cassandra-3.4.jar:/usr/local/cas
>>>>>>> sandra/lib/apache-cassandra-clientutil-3.4.jar:/usr/local/cas
>>>>>>> sandra/lib/apache-cassandra-thrift-3.4.jar:/usr/local/cas
>>>>>>> sandra/lib/asm-5.0.4.jar:/usr/local/cassandra/lib/cas
>>>>>>> sandra-driver-core-3.0.0-shaded.jar:/usr/local/cas
>>>>>>> sandra/lib/commons-cli-1.1.jar:/usr/local/cas
>>>>>>> sandra/lib/commons-codec-1.2.jar:/usr/local/cas
>>>>>>> sandra/lib/commons-lang3-3.1.jar:/usr/local/cas
>>>>>>> sandra/lib/commons-math3-3.2.jar:/usr/local/cas
>>>>>>> sandra/lib/compress-lzf-0.8.4.jar:/usr/local/cas
>>>>>>> sandra/lib/concurrentlinkedhashmap-lru-1.4.jar:/usr/local/cas
>>>>>>> sandra/lib/concurrent-trees-2.4.0.jar:/usr/local/cas
>>>>>>> sandra/lib/disruptor-3.0.1.jar:/usr/local/cas
>>>>>>> sandra/lib/ecj-4.4.2.jar:/usr/local/cas
>>>>>>> sandra/lib/guava-18.0.jar:/usr/local/cas
>>>>>>> sandra/lib/high-scale-lib-1.0.6.jar:/usr/local/cas
>>>>>>> sandra/lib/hppc-0.5.4.jar:/usr/local/cas
>>>>>>> sandra/lib/jackson-core-asl-1.9.2.jar:/usr/local/cas
>>>>>>> sandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/local/cas
>>>>>>> sandra/lib/jamm-0.3.0.jar:/usr/local/cas
>>>>>>> sandra/lib/javax.inject.jar:/usr/local/cas
>>>>>>> sandra/lib/jbcrypt-0.3m.jar:/usr/local/cas
>>>>>>> sandra/lib/jcl-over-slf4j-1.7.7.jar:/usr/local/cas
>>>>>>> sandra/lib/jflex-1.6.0.jar:/usr/local/cas
>>>>>>> sandra/lib/jna-4.0.0.jar:/usr/local/cas
>>>>>>> sandra/lib/joda-time-2.4.jar:/usr/local/cas
>>>>>>> sandra/lib/json-simple-1.1.jar:/usr/local/cas
>>>>>>> sandra/lib/libthrift-0.9.2.jar:/usr/local/cas
>>>>>>> sandra/lib/log4j-over-slf4j-1.7.7.jar:/usr/local/cas
>>>>>>> sandra/lib/logback-classic-1.1.3.jar:/usr/local/cas
>>>>>>> sandra/lib/logback-core-1.1.3.jar:/usr/local/cas
>>>>>>> sandra/lib/lz4-1.3.0.jar:/usr/local/cas
>>>>>>> sandra/lib/metrics-core-3.1.0.jar:/usr/local/cas
>>>>>>> sandra/lib/metrics-logback-3.1.0.jar:/usr/local/cas
>>>>>>> sandra/lib/netty-all-4.0.23.Final.jar:/usr/local/cas
>>>>>>> sandra/lib/ohc-core-0.4.2.jar:/usr/local/cas
>>>>>>> sandra/lib/ohc-core-j8-0.4.2.jar:/usr/local/cas
>>>>>>> sandra/lib/primitive-1.0.jar:/usr/local/cas
>>>>>>> sandra/lib/reporter-config3-3.0.0.jar:/usr/local/cas
>>>>>>> sandra/lib/reporter-config-base-3.0.0.jar:/usr/local/cas
>>>>>>> sandra/lib/sigar-1.6.4.jar:/usr/local/cas
>>>>>>> sandra/lib/slf4j-api-1.7.7.jar:/usr/local/cas
>>>>>>> sandra/lib/snakeyaml-1.11.jar:/usr/local/cas
>>>>>>> sandra/lib/snappy-java-1.1.1.7.jar:/usr/local/cas
>>>>>>> sandra/lib/snowball-stemmer-1.3.0.581.1.jar:/usr/local/cas
>>>>>>> sandra/lib/ST4-4.0.8.jar:/usr/local/cas
>>>>>>> sandra/lib/stream-2.5.2.jar:/usr/local/cas
>>>>>>> sandra/lib/thrift-server-0.3.7.jar:/usr/local/cassandra/lib/jsr223/*/*.jar
>>>>>>> org.apache.cassandra.service.CassandraDaemon
>>>>>>
>>>>>>
>>>>>>  Here is some dmesg output:
>>>>>>
>>>>>> [40003.010117] java invoked oom-killer: gfp_mask=0x280da, order=0,
>>>>>> oom_score_adj=0
>>>>>> [40003.013042] java cpuset=/ mems_allowed=0
>>>>>> [40003.014789] CPU: 3 PID: 37757 Comm: java Tainted: G            E
>>>>>> 4.1.7-15.23.amzn1.x86_64 #1
>>>>>> [40003.017852] Hardware name: Xen HVM domU, BIOS 4.2.amazon 12/07/2015
>>>>>> [40003.020066]  0000000000000000 ffff8800ebaaba18 ffffffff814da12c
>>>>>> 0000000000000000
>>>>>> [40003.022870]  ffff880763594c80 ffff8800ebaabac8 ffffffff814d7939
>>>>>> ffff8800ebaaba78
>>>>>> [40003.025674]  ffffffff811bf8f7 ffff880770679c00 ffff88077001c190
>>>>>> 0000000000000080
>>>>>> [40003.028660] Call Trace:
>>>>>> [40003.029613]  [<ffffffff814da12c>] dump_stack+0x45/0x57
>>>>>> [40003.031486]  [<ffffffff814d7939>] dump_header+0x7f/0x1fe
>>>>>> [40003.033390]  [<ffffffff811bf8f7>] ? mem_cgroup_iter+0x137/0x3d0
>>>>>> [40003.035475]  [<ffffffff8107f496>] ? __queue_work+0x136/0x320
>>>>>> [40003.037594]  [<ffffffff8115d11c>] oom_kill_process+0x1cc/0x3b0
>>>>>> [40003.039825]  [<ffffffff8115d67e>] __out_of_memory+0x31e/0x530
>>>>>> [40003.041938]  [<ffffffff8115da2b>] out_of_memory+0x5b/0x80
>>>>>> [40003.043857]  [<ffffffff81162a79>]
>>>>>> __alloc_pages_nodemask+0x8a9/0x8d0
>>>>>> [40003.046105]  [<ffffffff811a48fa>] alloc_page_interleave+0x3a/0x90
>>>>>> [40003.048419]  [<ffffffff811a79c3>] alloc_pages_vma+0x143/0x200
>>>>>> [40003.050582]  [<ffffffff81188035>] handle_mm_fault+0x1355/0x1770
>>>>>> [40003.052674]  [<ffffffff8118e4c5>] ? do_mmap_pgoff+0x2f5/0x3c0
>>>>>> [40003.054737]  [<ffffffff8105dafc>] __do_page_fault+0x17c/0x420
>>>>>> [40003.056858]  [<ffffffff8118c976>] ? SyS_mmap_pgoff+0x116/0x270
>>>>>> [40003.059082]  [<ffffffff8105ddc2>] do_page_fault+0x22/0x30
>>>>>> [40003.061084]  [<ffffffff814e2ad8>] page_fault+0x28/0x30
>>>>>> [40003.062938] Mem-Info:
>>>>>> [40003.063762] active_anon:5437903 inactive_anon:1025 isolated_anon:0
>>>>>>  active_file:51 inactive_file:8 isolated_file:0
>>>>>>  unevictable:2088582 dirty:0 writeback:0 unstable:0
>>>>>>  slab_reclaimable:82028 slab_unreclaimable:12209
>>>>>>  mapped:31065 shmem:20 pagetables:37089 bounce:0
>>>>>>  free:35830 free_pcp:3141 free_cma:0
>>>>>> [40003.075549] Node 0 DMA free:15872kB min:8kB low:8kB high:12kB
>>>>>> active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB
>>>>>> unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15988kB
>>>>>> managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB
>>>>>> slab_reclaimable:32kB slab_unreclaimable:0kB kernel_stack:0kB
>>>>>> pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB
>>>>>> free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
>>>>>> [40003.090267] lowmem_reserve[]: 0 3746 30128 30128
>>>>>> [40003.092182] Node 0 DMA32 free:108236kB min:2756kB low:3444kB
>>>>>> high:4132kB active_anon:2400616kB inactive_anon:4060kB active_file:0kB
>>>>>> inactive_file:0kB unevictable:1049732kB isolated(anon):0kB
>>>>>> isolated(file):0kB present:3915776kB managed:3840296kB mlocked:1049732kB
>>>>>> dirty:4kB writeback:0kB mapped:16564kB shmem:12kB slab_reclaimable:243852kB
>>>>>> slab_unreclaimable:8832kB kernel_stack:1152kB pagetables:16532kB
>>>>>> unstable:0kB bounce:0kB free_pcp:5716kB local_pcp:220kB free_cma:0kB
>>>>>> writeback_tmp:0kB pages_scanned:5408 all_unreclaimable? yes
>>>>>> [40003.108802] lowmem_reserve[]: 0 0 26382 26382
>>>>>> [40003.110578] Node 0 Normal free:19212kB min:19412kB low:24264kB
>>>>>> high:29116kB active_anon:19350996kB inactive_anon:40kB active_file:212kB
>>>>>> inactive_file:80kB unevictable:7304596kB isolated(anon):0kB
>>>>>> isolated(file):0kB present:27525120kB managed:27015196kB mlocked:7304596kB
>>>>>> dirty:0kB writeback:0kB mapped:107696kB shmem:68kB slab_reclaimable:84228kB
>>>>>> slab_unreclaimable:40004kB kernel_stack:10000kB pagetables:131824kB
>>>>>> unstable:0kB bounce:0kB free_pcp:6848kB local_pcp:692kB free_cma:0kB
>>>>>> writeback_tmp:0kB pages_scanned:38332 all_unreclaimable? yes
>>>>>> [40003.128300] lowmem_reserve[]: 0 0 0 0
>>>>>> [40003.129844] Node 0 DMA: 0*4kB 0*8kB 0*16kB 2*32kB (UE) 3*64kB (UE)
>>>>>> 2*128kB (UE) 2*256kB (UE) 1*512kB (E) 2*1024kB (UE) 2*2048kB (ER) 2*4096kB
>>>>>> (M) = 15872kB
>>>>>> [40003.135917] Node 0 DMA32: 193*4kB (UEM) 254*8kB (UEM) 714*16kB
>>>>>> (UE) 1344*32kB (UEMR) 249*64kB (UEMR) 120*128kB (UER) 53*256kB (ER)
>>>>>> 10*512kB (ER) 1*1024kB (E) 0*2048kB 0*4096kB = 108244kB
>>>>>> [40003.142956] Node 0 Normal: 3956*4kB (UE) 0*8kB 1*16kB (R) 8*32kB
>>>>>> (R) 3*64kB (R) 2*128kB (R) 3*256kB (R) 0*512kB 0*1024kB 1*2048kB (R)
>>>>>> 0*4096kB = 19360kB
>>>>>> [40003.148749] Node 0 hugepages_total=0 hugepages_free=0
>>>>>> hugepages_surp=0 hugepages_size=2048kB
>>>>>> [40003.151777] 31304 total pagecache pages
>>>>>> [40003.153288] 0 pages in swap cache
>>>>>> [40003.154528] Swap cache stats: add 0, delete 0, find 0/0
>>>>>> [40003.156377] Free swap  = 0kB
>>>>>> [40003.157423] Total swap = 0kB
>>>>>> [40003.158465] 7864221 pages RAM
>>>>>> [40003.159522] 0 pages HighMem/MovableOnly
>>>>>> [40003.160984] 146372 pages reserved
>>>>>> [40003.162244] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds
>>>>>> swapents oom_score_adj name
>>>>>> [40003.165398] [ 2560]     0  2560     2804      181      11       3
>>>>>>       0         -1000 udevd
>>>>>> [40003.168638] [ 3976]     0  3976     2334      123       9       3
>>>>>>       0             0 dhclient
>>>>>> [40003.171895] [ 4017]     0  4017    11626       89      23       4
>>>>>>       0         -1000 auditd
>>>>>> [40003.175080] [ 4035]     0  4035    61861       99      23       3
>>>>>>       0             0 rsyslogd
>>>>>> [40003.178198] [ 4046]     0  4046     3462       98      10       3
>>>>>>       0             0 irqbalance
>>>>>> [40003.181559] [ 4052]     0  4052     1096       22       7       3
>>>>>>       0             0 rngd
>>>>>> [40003.184683] [ 4067]    32  4067     8815       99      22       3
>>>>>>       0             0 rpcbind
>>>>>> [40003.187772] [ 4084]    29  4084     9957      201      24       3
>>>>>>       0             0 rpc.statd
>>>>>> [40003.191099] [ 4115]    81  4115     5442       60      15       3
>>>>>>       0             0 dbus-daemon
>>>>>> [40003.194438] [ 4333]     0  4333    19452      522      40       3
>>>>>>       0         -1000 sshd
>>>>>> [40003.197432] [ 4361]    38  4361     7321      562      19       3
>>>>>>       0             0 ntpd
>>>>>> [40003.200609] [ 4376]     0  4376    22238      720      46       3
>>>>>>       0             0 sendmail
>>>>>> [40003.203868] [ 4384]    51  4384    20103      674      41       3
>>>>>>       0             0 sendmail
>>>>>> [40003.206963] [ 4515]     0  4515     4267       38      13       3
>>>>>>       0             0 atd
>>>>>> [40003.210100] [ 6730]     0  6730    29888      547      13       3
>>>>>>       0             0 crond
>>>>>> [40003.213267] [13533]   497 13533 47235415  7455314   36074     167
>>>>>>       0             0 java
>>>>>> [40003.216364] [13674]   498 13674    49154     3168      51       3
>>>>>>       0             0 supervisord
>>>>>> [40003.219721] [13680]   498 13680    51046     5350      69       3
>>>>>>       0             0 python
>>>>>> [40003.222908] [13682]   498 13682    36172     5602      75       3
>>>>>>       0             0 python
>>>>>> [40003.225952] [13683]   498 13683    32633     5319      68       3
>>>>>>       0             0 python
>>>>>> [40003.229108] [13684]   498 13684    29577     5003      63       3
>>>>>>       0             0 python
>>>>>> [40003.232263] [13719]   498 13719  1035920    41287     234       8
>>>>>>       0             0 java
>>>>>> [40003.235287] [13753]   498 13753    34605     5645      70       3
>>>>>>       0             0 python
>>>>>> [40003.238322] [14143]     0 14143     1615      420       9       3
>>>>>>       0             0 agetty
>>>>>> [40003.241582] [14145]     0 14145     1078      377       8       3
>>>>>>       0             0 mingetty
>>>>>> [40003.244752] [14147]     0 14147     1078      354       8       3
>>>>>>       0             0 mingetty
>>>>>> [40003.247833] [14149]     0 14149     1078      373       8       3
>>>>>>       0             0 mingetty
>>>>>> [40003.251193] [14151]     0 14151     1078      367       7       3
>>>>>>       0             0 mingetty
>>>>>> [40003.254342] [14153]     0 14153     1078      348       8       3
>>>>>>       0             0 mingetty
>>>>>> [40003.257443] [14154]     0 14154     2803      182      10       3
>>>>>>       0         -1000 udevd
>>>>>> [40003.260688] [14155]     0 14155     2803      182      10       3
>>>>>>       0         -1000 udevd
>>>>>> [40003.263782] [14157]     0 14157     1078      369       8       3
>>>>>>       0             0 mingetty
>>>>>> [40003.266895] Out of memory: Kill process 13533 (java) score 970 or
>>>>>> sacrifice child
>>>>>> [40003.269702] Killed process 13533 (java) total-vm:188941660kB,
>>>>>> anon-rss:29710828kB, file-rss:110428kB
>>>>>>
>>>>>>
>>>>
>>>
>>
>

Re: Cassandra causing OOM Killer to strike on new cluster running 3.4

Posted by Adam Plumb <ap...@fiksu.com>.
So it's looking like the bloom filter off heap memory usage is ramping up
and up until the OOM killer kills the java process.  I relaunched on
instances with 60GB of memory and the same thing is happening.  A node will
start using more and more RAM until the process is killed, then another
node will start using more and more until it is also killed.

Is this the expected behavior?  It doesn't seem ideal to me.  Is there
anything obvious that I'm doing wrong?
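
For anyone who wants to watch the same thing, here is a minimal sketch of the
checks involved, assuming the abc keyspace and the xyz/def tables from the
schema quoted below; run them repeatedly over time and compare against the
process RSS:

    # node-level heap vs. off-heap memory as Cassandra reports it
    nodetool info | grep -i heap

    # per-table bloom filter memory ("Bloom filter off heap memory used" lines);
    # on older nodetool versions the command is cfstats
    nodetool tablestats abc.xyz abc.def | grep -i 'bloom filter'

    # resident set size of the Cassandra JVM, for comparison
    ps -o pid,rss,vsz -C java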

On Fri, Mar 11, 2016 at 11:31 AM, Adam Plumb <ap...@fiksu.com> wrote:

> Here is the creation syntax for the entire schema.  The xyz table has
> about 2.1 billion keys and the def table has about 230 million keys.  Max
> row size is about 3KB, mean row size is 700B.
>
> CREATE KEYSPACE abc WITH replication = {'class':
>> 'NetworkTopologyStrategy', 'us-east': 3};
>> CREATE TABLE xyz (
>>   id text,
>>   secondary_id int,
>>   data text,
>>   PRIMARY KEY(id)
>> )
>>   WITH
>>   compaction = { 'class': 'LeveledCompactionStrategy' }
>>   and compression = {'class': 'LZ4Compressor'};
>> CREATE INDEX secondary_id_index ON abc.xyz (secondary_id);
>> CREATE TABLE def (
>>   id text,
>>   secondary_id int,
>>   data text,
>>   PRIMARY KEY(id)
>> )
>>   WITH
>>   compaction = { 'class': 'LeveledCompactionStrategy' }
>>   and compression = {'class': 'LZ4Compressor'};
>> CREATE INDEX secondary_id_index_def ON abc.def (secondary_id);
>
>
> On Fri, Mar 11, 2016 at 11:24 AM, Jack Krupansky <jack.krupansky@gmail.com
> > wrote:
>
>> What is your schema and data like - in particular, how wide are your
>> partitions (number of rows and typical row size)?
>>
>> Maybe you just need (a lot) more heap for rows during the repair process.
>>
>> -- Jack Krupansky
>>
>> On Fri, Mar 11, 2016 at 11:19 AM, Adam Plumb <ap...@fiksu.com> wrote:
>>
>>> These are brand new boxes only running Cassandra.  Yeah the kernel is
>>> what is killing the JVM, and this does appear to be a memory leak in
>>> Cassandra.  And Cassandra is the only thing running, aside from the basic
>>> services needed for Amazon Linux to run.
>>>
>>> On Fri, Mar 11, 2016 at 11:17 AM, Sebastian Estevez <
>>> sebastian.estevez@datastax.com> wrote:
>>>
>>>> Sacrifice child in dmesg is your OS killing the process with the most
>>>> ram. That means you're actually running out of memory at the Linux level
>>>> outside of the JVM.
>>>>
>>>> Are you running anything other than Cassandra on this box?
>>>>
>>>> If so, does it have a memory leak?
>>>>
>>>> all the best,
>>>>
>>>> Sebastián
>>>> On Mar 11, 2016 11:14 AM, "Adam Plumb" <ap...@fiksu.com> wrote:
>>>>
>>>>> I've got a new cluster of 18 nodes running Cassandra 3.4 that I just
>>>>> launched and loaded data into yesterday (roughly 2TB of total storage) and
>>>>> am seeing runaway memory usage.  These nodes are EC2 c3.4xlarges with 30GB
>>>>> RAM and the heap size is set to 8G with a new heap size of 1.6G.
>>>>>
>>>>> Last night I finished loading up the data, then ran an incremental
>>>>> repair on one of the nodes just to ensure that everything was working
>>>>> (nodetool repair).  Over night all 18 nodes ran out of memory and were
>>>>> killed by the OOM killer.  I restarted them this morning and they all came
>>>>> up fine, but just started churning through memory and got killed again.  I
>>>>> restarted them again and they're doing the same thing.  I'm not getting any
>>>>> errors in the system log, since the process is getting killed abruptly
>>>>> (which makes me think this is a native memory issue, not heap)
>>>>>
>>>>> Obviously this behavior isn't the best.  I'm willing to provide any
>>>>> data people need to help debug this, these nodes are still up and running.
>>>>> I'm also in IRC if anyone wants to jump on there.
>>>>>
>>>>> Here is the output of ps aux:
>>>>>
>>>>> 497       64351  108 89.5 187156072 27642988 ?  SLl  15:13  62:15 java
>>>>>> -ea -XX:+CMSClassUnloadingEnabled -XX:+UseThreadPriorities
>>>>>> -XX:ThreadPriorityPolicy=42 -Xms7536M -Xmx7536M -Xmn1600M
>>>>>> -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=1000003
>>>>>> -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
>>>>>> -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1
>>>>>> -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
>>>>>> -XX:+UseTLAB -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=45
>>>>>> -XX:-ParallelRefProcEnabled -XX:-AlwaysPreTouch -XX:+UseBiasedLocking
>>>>>> -XX:+UseTLAB -XX:+ResizeTLAB -Djava.net.preferIPv4Stack=true
>>>>>> -Dcom.sun.management.jmxremote.port=7199
>>>>>> -Dcom.sun.management.jmxremote.rmi.port=7199
>>>>>> -Dcom.sun.management.jmxremote.ssl=false
>>>>>> -Dcom.sun.management.jmxremote.authenticate=false
>>>>>> -XX:+CMSClassUnloadingEnabled -Dlogback.configurationFile=logback.xml -D
>>>>>> cassandra.logdir=/usr/local/cassandra/logs -Dcas
>>>>>> sandra.storagedir=/usr/local/cassandra/data -Dcas
>>>>>> sandra-pidfile=/var/run/cassandra/cassandra.pid -cp /usr/local/
>>>>>> cassandra/conf:/usr/local/cas
>>>>>> sandra/build/classes/main:/usr/local/cas
>>>>>> sandra/build/classes/thrift:/usr/local/cas
>>>>>> sandra/lib/airline-0.6.jar:/usr/local/cas
>>>>>> sandra/lib/antlr-runtime-3.5.2.jar:/usr/local/cassandra/lib/apache-
>>>>>> cassandra-3.4.jar:/usr/local/cassandra/lib/apache-cas
>>>>>> sandra-clientutil-3.4.jar:/usr/local/cassandra/lib/apache-cas
>>>>>> sandra-thrift-3.4.jar:/usr/local/cas
>>>>>> sandra/lib/asm-5.0.4.jar:/usr/local/cassandra/lib/cas
>>>>>> sandra-driver-core-3.0.0-shaded.jar:/usr/local/cas
>>>>>> sandra/lib/commons-cli-1.1.jar:/usr/local/cas
>>>>>> sandra/lib/commons-codec-1.2.jar:/usr/local/cas
>>>>>> sandra/lib/commons-lang3-3.1.jar:/usr/local/cas
>>>>>> sandra/lib/commons-math3-3.2.jar:/usr/local/cas
>>>>>> sandra/lib/compress-lzf-0.8.4.jar:/usr/local/cas
>>>>>> sandra/lib/concurrentlinkedhashmap-lru-1.4.jar:/usr/local/cas
>>>>>> sandra/lib/concurrent-trees-2.4.0.jar:/usr/local/cas
>>>>>> sandra/lib/disruptor-3.0.1.jar:/usr/local/cas
>>>>>> sandra/lib/ecj-4.4.2.jar:/usr/local/cas
>>>>>> sandra/lib/guava-18.0.jar:/usr/local/cas
>>>>>> sandra/lib/high-scale-lib-1.0.6.jar:/usr/local/cas
>>>>>> sandra/lib/hppc-0.5.4.jar:/usr/local/cas
>>>>>> sandra/lib/jackson-core-asl-1.9.2.jar:/usr/local/cas
>>>>>> sandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/local/cas
>>>>>> sandra/lib/jamm-0.3.0.jar:/usr/local/cas
>>>>>> sandra/lib/javax.inject.jar:/usr/local/cas
>>>>>> sandra/lib/jbcrypt-0.3m.jar:/usr/local/cas
>>>>>> sandra/lib/jcl-over-slf4j-1.7.7.jar:/usr/local/cas
>>>>>> sandra/lib/jflex-1.6.0.jar:/usr/local/cas
>>>>>> sandra/lib/jna-4.0.0.jar:/usr/local/cas
>>>>>> sandra/lib/joda-time-2.4.jar:/usr/local/cas
>>>>>> sandra/lib/json-simple-1.1.jar:/usr/local/cas
>>>>>> sandra/lib/libthrift-0.9.2.jar:/usr/local/cas
>>>>>> sandra/lib/log4j-over-slf4j-1.7.7.jar:/usr/local/cas
>>>>>> sandra/lib/logback-classic-1.1.3.jar:/usr/local/cas
>>>>>> sandra/lib/logback-core-1.1.3.jar:/usr/local/cas
>>>>>> sandra/lib/lz4-1.3.0.jar:/usr/local/cas
>>>>>> sandra/lib/metrics-core-3.1.0.jar:/usr/local/cas
>>>>>> sandra/lib/metrics-logback-3.1.0.jar:/usr/local/cas
>>>>>> sandra/lib/netty-all-4.0.23.Final.jar:/usr/local/cas
>>>>>> sandra/lib/ohc-core-0.4.2.jar:/usr/local/cas
>>>>>> sandra/lib/ohc-core-j8-0.4.2.jar:/usr/local/cas
>>>>>> sandra/lib/primitive-1.0.jar:/usr/local/cas
>>>>>> sandra/lib/reporter-config3-3.0.0.jar:/usr/local/cas
>>>>>> sandra/lib/reporter-config-base-3.0.0.jar:/usr/local/cas
>>>>>> sandra/lib/sigar-1.6.4.jar:/usr/local/cas
>>>>>> sandra/lib/slf4j-api-1.7.7.jar:/usr/local/cas
>>>>>> sandra/lib/snakeyaml-1.11.jar:/usr/local/cas
>>>>>> sandra/lib/snappy-java-1.1.1.7.jar:/usr/local/cas
>>>>>> sandra/lib/snowball-stemmer-1.3.0.581.1.jar:/usr/local/cas
>>>>>> sandra/lib/ST4-4.0.8.jar:/usr/local/cas
>>>>>> sandra/lib/stream-2.5.2.jar:/usr/local/cas
>>>>>> sandra/lib/thrift-server-0.3.7.jar:/usr/local/cassandra/lib/jsr223/*/*.jar
>>>>>> org.apache.cassandra.service.CassandraDaemon
>>>>>
>>>>>
>>>>>  Here is some dmesg output:
>>>>>
>>>>> [40003.010117] java invoked oom-killer: gfp_mask=0x280da, order=0,
>>>>> oom_score_adj=0
>>>>> [40003.013042] java cpuset=/ mems_allowed=0
>>>>> [40003.014789] CPU: 3 PID: 37757 Comm: java Tainted: G            E
>>>>> 4.1.7-15.23.amzn1.x86_64 #1
>>>>> [40003.017852] Hardware name: Xen HVM domU, BIOS 4.2.amazon 12/07/2015
>>>>> [40003.020066]  0000000000000000 ffff8800ebaaba18 ffffffff814da12c
>>>>> 0000000000000000
>>>>> [40003.022870]  ffff880763594c80 ffff8800ebaabac8 ffffffff814d7939
>>>>> ffff8800ebaaba78
>>>>> [40003.025674]  ffffffff811bf8f7 ffff880770679c00 ffff88077001c190
>>>>> 0000000000000080
>>>>> [40003.028660] Call Trace:
>>>>> [40003.029613]  [<ffffffff814da12c>] dump_stack+0x45/0x57
>>>>> [40003.031486]  [<ffffffff814d7939>] dump_header+0x7f/0x1fe
>>>>> [40003.033390]  [<ffffffff811bf8f7>] ? mem_cgroup_iter+0x137/0x3d0
>>>>> [40003.035475]  [<ffffffff8107f496>] ? __queue_work+0x136/0x320
>>>>> [40003.037594]  [<ffffffff8115d11c>] oom_kill_process+0x1cc/0x3b0
>>>>> [40003.039825]  [<ffffffff8115d67e>] __out_of_memory+0x31e/0x530
>>>>> [40003.041938]  [<ffffffff8115da2b>] out_of_memory+0x5b/0x80
>>>>> [40003.043857]  [<ffffffff81162a79>] __alloc_pages_nodemask+0x8a9/0x8d0
>>>>> [40003.046105]  [<ffffffff811a48fa>] alloc_page_interleave+0x3a/0x90
>>>>> [40003.048419]  [<ffffffff811a79c3>] alloc_pages_vma+0x143/0x200
>>>>> [40003.050582]  [<ffffffff81188035>] handle_mm_fault+0x1355/0x1770
>>>>> [40003.052674]  [<ffffffff8118e4c5>] ? do_mmap_pgoff+0x2f5/0x3c0
>>>>> [40003.054737]  [<ffffffff8105dafc>] __do_page_fault+0x17c/0x420
>>>>> [40003.056858]  [<ffffffff8118c976>] ? SyS_mmap_pgoff+0x116/0x270
>>>>> [40003.059082]  [<ffffffff8105ddc2>] do_page_fault+0x22/0x30
>>>>> [40003.061084]  [<ffffffff814e2ad8>] page_fault+0x28/0x30
>>>>> [40003.062938] Mem-Info:
>>>>> [40003.063762] active_anon:5437903 inactive_anon:1025 isolated_anon:0
>>>>>  active_file:51 inactive_file:8 isolated_file:0
>>>>>  unevictable:2088582 dirty:0 writeback:0 unstable:0
>>>>>  slab_reclaimable:82028 slab_unreclaimable:12209
>>>>>  mapped:31065 shmem:20 pagetables:37089 bounce:0
>>>>>  free:35830 free_pcp:3141 free_cma:0
>>>>> [40003.075549] Node 0 DMA free:15872kB min:8kB low:8kB high:12kB
>>>>> active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB
>>>>> unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15988kB
>>>>> managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB
>>>>> slab_reclaimable:32kB slab_unreclaimable:0kB kernel_stack:0kB
>>>>> pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB
>>>>> free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
>>>>> [40003.090267] lowmem_reserve[]: 0 3746 30128 30128
>>>>> [40003.092182] Node 0 DMA32 free:108236kB min:2756kB low:3444kB
>>>>> high:4132kB active_anon:2400616kB inactive_anon:4060kB active_file:0kB
>>>>> inactive_file:0kB unevictable:1049732kB isolated(anon):0kB
>>>>> isolated(file):0kB present:3915776kB managed:3840296kB mlocked:1049732kB
>>>>> dirty:4kB writeback:0kB mapped:16564kB shmem:12kB slab_reclaimable:243852kB
>>>>> slab_unreclaimable:8832kB kernel_stack:1152kB pagetables:16532kB
>>>>> unstable:0kB bounce:0kB free_pcp:5716kB local_pcp:220kB free_cma:0kB
>>>>> writeback_tmp:0kB pages_scanned:5408 all_unreclaimable? yes
>>>>> [40003.108802] lowmem_reserve[]: 0 0 26382 26382
>>>>> [40003.110578] Node 0 Normal free:19212kB min:19412kB low:24264kB
>>>>> high:29116kB active_anon:19350996kB inactive_anon:40kB active_file:212kB
>>>>> inactive_file:80kB unevictable:7304596kB isolated(anon):0kB
>>>>> isolated(file):0kB present:27525120kB managed:27015196kB mlocked:7304596kB
>>>>> dirty:0kB writeback:0kB mapped:107696kB shmem:68kB slab_reclaimable:84228kB
>>>>> slab_unreclaimable:40004kB kernel_stack:10000kB pagetables:131824kB
>>>>> unstable:0kB bounce:0kB free_pcp:6848kB local_pcp:692kB free_cma:0kB
>>>>> writeback_tmp:0kB pages_scanned:38332 all_unreclaimable? yes
>>>>> [40003.128300] lowmem_reserve[]: 0 0 0 0
>>>>> [40003.129844] Node 0 DMA: 0*4kB 0*8kB 0*16kB 2*32kB (UE) 3*64kB (UE)
>>>>> 2*128kB (UE) 2*256kB (UE) 1*512kB (E) 2*1024kB (UE) 2*2048kB (ER) 2*4096kB
>>>>> (M) = 15872kB
>>>>> [40003.135917] Node 0 DMA32: 193*4kB (UEM) 254*8kB (UEM) 714*16kB (UE)
>>>>> 1344*32kB (UEMR) 249*64kB (UEMR) 120*128kB (UER) 53*256kB (ER) 10*512kB
>>>>> (ER) 1*1024kB (E) 0*2048kB 0*4096kB = 108244kB
>>>>> [40003.142956] Node 0 Normal: 3956*4kB (UE) 0*8kB 1*16kB (R) 8*32kB
>>>>> (R) 3*64kB (R) 2*128kB (R) 3*256kB (R) 0*512kB 0*1024kB 1*2048kB (R)
>>>>> 0*4096kB = 19360kB
>>>>> [40003.148749] Node 0 hugepages_total=0 hugepages_free=0
>>>>> hugepages_surp=0 hugepages_size=2048kB
>>>>> [40003.151777] 31304 total pagecache pages
>>>>> [40003.153288] 0 pages in swap cache
>>>>> [40003.154528] Swap cache stats: add 0, delete 0, find 0/0
>>>>> [40003.156377] Free swap  = 0kB
>>>>> [40003.157423] Total swap = 0kB
>>>>> [40003.158465] 7864221 pages RAM
>>>>> [40003.159522] 0 pages HighMem/MovableOnly
>>>>> [40003.160984] 146372 pages reserved
>>>>> [40003.162244] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds
>>>>> swapents oom_score_adj name
>>>>> [40003.165398] [ 2560]     0  2560     2804      181      11       3
>>>>>       0         -1000 udevd
>>>>> [40003.168638] [ 3976]     0  3976     2334      123       9       3
>>>>>       0             0 dhclient
>>>>> [40003.171895] [ 4017]     0  4017    11626       89      23       4
>>>>>       0         -1000 auditd
>>>>> [40003.175080] [ 4035]     0  4035    61861       99      23       3
>>>>>       0             0 rsyslogd
>>>>> [40003.178198] [ 4046]     0  4046     3462       98      10       3
>>>>>       0             0 irqbalance
>>>>> [40003.181559] [ 4052]     0  4052     1096       22       7       3
>>>>>       0             0 rngd
>>>>> [40003.184683] [ 4067]    32  4067     8815       99      22       3
>>>>>       0             0 rpcbind
>>>>> [40003.187772] [ 4084]    29  4084     9957      201      24       3
>>>>>       0             0 rpc.statd
>>>>> [40003.191099] [ 4115]    81  4115     5442       60      15       3
>>>>>       0             0 dbus-daemon
>>>>> [40003.194438] [ 4333]     0  4333    19452      522      40       3
>>>>>       0         -1000 sshd
>>>>> [40003.197432] [ 4361]    38  4361     7321      562      19       3
>>>>>       0             0 ntpd
>>>>> [40003.200609] [ 4376]     0  4376    22238      720      46       3
>>>>>       0             0 sendmail
>>>>> [40003.203868] [ 4384]    51  4384    20103      674      41       3
>>>>>       0             0 sendmail
>>>>> [40003.206963] [ 4515]     0  4515     4267       38      13       3
>>>>>       0             0 atd
>>>>> [40003.210100] [ 6730]     0  6730    29888      547      13       3
>>>>>       0             0 crond
>>>>> [40003.213267] [13533]   497 13533 47235415  7455314   36074     167
>>>>>       0             0 java
>>>>> [40003.216364] [13674]   498 13674    49154     3168      51       3
>>>>>       0             0 supervisord
>>>>> [40003.219721] [13680]   498 13680    51046     5350      69       3
>>>>>       0             0 python
>>>>> [40003.222908] [13682]   498 13682    36172     5602      75       3
>>>>>       0             0 python
>>>>> [40003.225952] [13683]   498 13683    32633     5319      68       3
>>>>>       0             0 python
>>>>> [40003.229108] [13684]   498 13684    29577     5003      63       3
>>>>>       0             0 python
>>>>> [40003.232263] [13719]   498 13719  1035920    41287     234       8
>>>>>       0             0 java
>>>>> [40003.235287] [13753]   498 13753    34605     5645      70       3
>>>>>       0             0 python
>>>>> [40003.238322] [14143]     0 14143     1615      420       9       3
>>>>>       0             0 agetty
>>>>> [40003.241582] [14145]     0 14145     1078      377       8       3
>>>>>       0             0 mingetty
>>>>> [40003.244752] [14147]     0 14147     1078      354       8       3
>>>>>       0             0 mingetty
>>>>> [40003.247833] [14149]     0 14149     1078      373       8       3
>>>>>       0             0 mingetty
>>>>> [40003.251193] [14151]     0 14151     1078      367       7       3
>>>>>       0             0 mingetty
>>>>> [40003.254342] [14153]     0 14153     1078      348       8       3
>>>>>       0             0 mingetty
>>>>> [40003.257443] [14154]     0 14154     2803      182      10       3
>>>>>       0         -1000 udevd
>>>>> [40003.260688] [14155]     0 14155     2803      182      10       3
>>>>>       0         -1000 udevd
>>>>> [40003.263782] [14157]     0 14157     1078      369       8       3
>>>>>       0             0 mingetty
>>>>> [40003.266895] Out of memory: Kill process 13533 (java) score 970 or
>>>>> sacrifice child
>>>>> [40003.269702] Killed process 13533 (java) total-vm:188941660kB,
>>>>> anon-rss:29710828kB, file-rss:110428kB
>>>>>
>>>>>
>>>
>>
>

Re: Cassandra causing OOM Killer to strike on new cluster running 3.4

Posted by Adam Plumb <ap...@fiksu.com>.
Here is the creation syntax for the entire schema.  The xyz table has about
2.1 billion keys and the def table has about 230 million keys.  Max row
size is about 3KB, mean row size is 700B.

CREATE KEYSPACE abc WITH replication = {'class': 'NetworkTopologyStrategy',
> 'us-east': 3};
> CREATE TABLE xyz (
>   id text,
>   secondary_id int,
>   data text,
>   PRIMARY KEY(id)
> )
>   WITH
>   compaction = { 'class': 'LeveledCompactionStrategy' }
>   and compression = {'class': 'LZ4Compressor'};
> CREATE INDEX secondary_id_index ON abc.xyz (secondary_id);
> CREATE TABLE def (
>   id text,
>   secondary_id int,
>   data text,
>   PRIMARY KEY(id)
> )
>   WITH
>   compaction = { 'class': 'LeveledCompactionStrategy' }
>   and compression = {'class': 'LZ4Compressor'};
> CREATE INDEX secondary_id_index_def ON abc.def (secondary_id);
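
One thing possibly worth checking with this schema: neither table sets
bloom_filter_fp_chance, so both run with the default for their compaction
strategy (LCS uses a higher default false-positive chance, i.e. smaller
filters, if memory serves). A quick, purely illustrative way to confirm the
effective value from cqlsh:

    # effective bloom filter setting per table (Cassandra 3.x schema tables)
    cqlsh -e "SELECT table_name, bloom_filter_fp_chance
              FROM system_schema.tables
              WHERE keyspace_name = 'abc';"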


On Fri, Mar 11, 2016 at 11:24 AM, Jack Krupansky <ja...@gmail.com>
wrote:

> What is your schema and data like - in particular, how wide are your
> partitions (number of rows and typical row size)?
>
> Maybe you just need (a lot) more heap for rows during the repair process.
>
> -- Jack Krupansky
>
> On Fri, Mar 11, 2016 at 11:19 AM, Adam Plumb <ap...@fiksu.com> wrote:
>
>> These are brand new boxes only running Cassandra.  Yeah the kernel is
>> what is killing the JVM, and this does appear to be a memory leak in
>> Cassandra.  And Cassandra is the only thing running, aside from the basic
>> services needed for Amazon Linux to run.
>>
>> On Fri, Mar 11, 2016 at 11:17 AM, Sebastian Estevez <
>> sebastian.estevez@datastax.com> wrote:
>>
>>> Sacrifice child in dmesg is your OS killing the process with the most
>>> ram. That means you're actually running out of memory at the Linux level
>>> outside of the JVM.
>>>
>>> Are you running anything other than Cassandra on this box?
>>>
>>> If so, does it have a memory leak?
>>>
>>> all the best,
>>>
>>> Sebastián
>>> On Mar 11, 2016 11:14 AM, "Adam Plumb" <ap...@fiksu.com> wrote:
>>>
>>>> I've got a new cluster of 18 nodes running Cassandra 3.4 that I just
>>>> launched and loaded data into yesterday (roughly 2TB of total storage) and
>>>> am seeing runaway memory usage.  These nodes are EC2 c3.4xlarges with 30GB
>>>> RAM and the heap size is set to 8G with a new heap size of 1.6G.
>>>>
>>>> Last night I finished loading up the data, then ran an incremental
>>>> repair on one of the nodes just to ensure that everything was working
>>>> (nodetool repair).  Over night all 18 nodes ran out of memory and were
>>>> killed by the OOM killer.  I restarted them this morning and they all came
>>>> up fine, but just started churning through memory and got killed again.  I
>>>> restarted them again and they're doing the same thing.  I'm not getting any
>>>> errors in the system log, since the process is getting killed abruptly
>>>> (which makes me think this is a native memory issue, not heap)
>>>>
>>>> Obviously this behavior isn't the best.  I'm willing to provide any
>>>> data people need to help debug this, these nodes are still up and running.
>>>> I'm also in IRC if anyone wants to jump on there.
>>>>
>>>> Here is the output of ps aux:
>>>>
>>>> 497       64351  108 89.5 187156072 27642988 ?  SLl  15:13  62:15 java
>>>>> -ea -XX:+CMSClassUnloadingEnabled -XX:+UseThreadPriorities
>>>>> -XX:ThreadPriorityPolicy=42 -Xms7536M -Xmx7536M -Xmn1600M
>>>>> -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=1000003
>>>>> -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
>>>>> -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1
>>>>> -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
>>>>> -XX:+UseTLAB -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=45
>>>>> -XX:-ParallelRefProcEnabled -XX:-AlwaysPreTouch -XX:+UseBiasedLocking
>>>>> -XX:+UseTLAB -XX:+ResizeTLAB -Djava.net.preferIPv4Stack=true
>>>>> -Dcom.sun.management.jmxremote.port=7199
>>>>> -Dcom.sun.management.jmxremote.rmi.port=7199
>>>>> -Dcom.sun.management.jmxremote.ssl=false
>>>>> -Dcom.sun.management.jmxremote.authenticate=false
>>>>> -XX:+CMSClassUnloadingEnabled -Dlogback.configurationFile=logback.xml -D
>>>>> cassandra.logdir=/usr/local/cassandra/logs -Dcas
>>>>> sandra.storagedir=/usr/local/cassandra/data -Dcas
>>>>> sandra-pidfile=/var/run/cassandra/cassandra.pid -cp /usr/local/
>>>>> cassandra/conf:/usr/local/cassandra/build/classes/main:/usr/local/
>>>>> cassandra/build/classes/thrift:/usr/local/cas
>>>>> sandra/lib/airline-0.6.jar:/usr/local/cas
>>>>> sandra/lib/antlr-runtime-3.5.2.jar:/usr/local/cassandra/lib/apache-
>>>>> cassandra-3.4.jar:/usr/local/cassandra/lib/apache-cas
>>>>> sandra-clientutil-3.4.jar:/usr/local/cassandra/lib/apache-cas
>>>>> sandra-thrift-3.4.jar:/usr/local/cas
>>>>> sandra/lib/asm-5.0.4.jar:/usr/local/cassandra/lib/cas
>>>>> sandra-driver-core-3.0.0-shaded.jar:/usr/local/cas
>>>>> sandra/lib/commons-cli-1.1.jar:/usr/local/cas
>>>>> sandra/lib/commons-codec-1.2.jar:/usr/local/cas
>>>>> sandra/lib/commons-lang3-3.1.jar:/usr/local/cas
>>>>> sandra/lib/commons-math3-3.2.jar:/usr/local/cas
>>>>> sandra/lib/compress-lzf-0.8.4.jar:/usr/local/cas
>>>>> sandra/lib/concurrentlinkedhashmap-lru-1.4.jar:/usr/local/cas
>>>>> sandra/lib/concurrent-trees-2.4.0.jar:/usr/local/cas
>>>>> sandra/lib/disruptor-3.0.1.jar:/usr/local/cas
>>>>> sandra/lib/ecj-4.4.2.jar:/usr/local/cas
>>>>> sandra/lib/guava-18.0.jar:/usr/local/cas
>>>>> sandra/lib/high-scale-lib-1.0.6.jar:/usr/local/cas
>>>>> sandra/lib/hppc-0.5.4.jar:/usr/local/cas
>>>>> sandra/lib/jackson-core-asl-1.9.2.jar:/usr/local/cas
>>>>> sandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/local/cas
>>>>> sandra/lib/jamm-0.3.0.jar:/usr/local/cas
>>>>> sandra/lib/javax.inject.jar:/usr/local/cas
>>>>> sandra/lib/jbcrypt-0.3m.jar:/usr/local/cas
>>>>> sandra/lib/jcl-over-slf4j-1.7.7.jar:/usr/local/cas
>>>>> sandra/lib/jflex-1.6.0.jar:/usr/local/cas
>>>>> sandra/lib/jna-4.0.0.jar:/usr/local/cas
>>>>> sandra/lib/joda-time-2.4.jar:/usr/local/cas
>>>>> sandra/lib/json-simple-1.1.jar:/usr/local/cas
>>>>> sandra/lib/libthrift-0.9.2.jar:/usr/local/cas
>>>>> sandra/lib/log4j-over-slf4j-1.7.7.jar:/usr/local/cas
>>>>> sandra/lib/logback-classic-1.1.3.jar:/usr/local/cas
>>>>> sandra/lib/logback-core-1.1.3.jar:/usr/local/cas
>>>>> sandra/lib/lz4-1.3.0.jar:/usr/local/cas
>>>>> sandra/lib/metrics-core-3.1.0.jar:/usr/local/cas
>>>>> sandra/lib/metrics-logback-3.1.0.jar:/usr/local/cas
>>>>> sandra/lib/netty-all-4.0.23.Final.jar:/usr/local/cas
>>>>> sandra/lib/ohc-core-0.4.2.jar:/usr/local/cas
>>>>> sandra/lib/ohc-core-j8-0.4.2.jar:/usr/local/cas
>>>>> sandra/lib/primitive-1.0.jar:/usr/local/cas
>>>>> sandra/lib/reporter-config3-3.0.0.jar:/usr/local/cas
>>>>> sandra/lib/reporter-config-base-3.0.0.jar:/usr/local/cas
>>>>> sandra/lib/sigar-1.6.4.jar:/usr/local/cas
>>>>> sandra/lib/slf4j-api-1.7.7.jar:/usr/local/cas
>>>>> sandra/lib/snakeyaml-1.11.jar:/usr/local/cas
>>>>> sandra/lib/snappy-java-1.1.1.7.jar:/usr/local/cas
>>>>> sandra/lib/snowball-stemmer-1.3.0.581.1.jar:/usr/local/cas
>>>>> sandra/lib/ST4-4.0.8.jar:/usr/local/cas
>>>>> sandra/lib/stream-2.5.2.jar:/usr/local/cas
>>>>> sandra/lib/thrift-server-0.3.7.jar:/usr/local/cassandra/lib/jsr223/*/*.jar
>>>>> org.apache.cassandra.service.CassandraDaemon
>>>>
>>>>
>>>>  Here is some dmesg output:
>>>>
>>>> [40003.010117] java invoked oom-killer: gfp_mask=0x280da, order=0,
>>>> oom_score_adj=0
>>>> [40003.013042] java cpuset=/ mems_allowed=0
>>>> [40003.014789] CPU: 3 PID: 37757 Comm: java Tainted: G            E
>>>> 4.1.7-15.23.amzn1.x86_64 #1
>>>> [40003.017852] Hardware name: Xen HVM domU, BIOS 4.2.amazon 12/07/2015
>>>> [40003.020066]  0000000000000000 ffff8800ebaaba18 ffffffff814da12c
>>>> 0000000000000000
>>>> [40003.022870]  ffff880763594c80 ffff8800ebaabac8 ffffffff814d7939
>>>> ffff8800ebaaba78
>>>> [40003.025674]  ffffffff811bf8f7 ffff880770679c00 ffff88077001c190
>>>> 0000000000000080
>>>> [40003.028660] Call Trace:
>>>> [40003.029613]  [<ffffffff814da12c>] dump_stack+0x45/0x57
>>>> [40003.031486]  [<ffffffff814d7939>] dump_header+0x7f/0x1fe
>>>> [40003.033390]  [<ffffffff811bf8f7>] ? mem_cgroup_iter+0x137/0x3d0
>>>> [40003.035475]  [<ffffffff8107f496>] ? __queue_work+0x136/0x320
>>>> [40003.037594]  [<ffffffff8115d11c>] oom_kill_process+0x1cc/0x3b0
>>>> [40003.039825]  [<ffffffff8115d67e>] __out_of_memory+0x31e/0x530
>>>> [40003.041938]  [<ffffffff8115da2b>] out_of_memory+0x5b/0x80
>>>> [40003.043857]  [<ffffffff81162a79>] __alloc_pages_nodemask+0x8a9/0x8d0
>>>> [40003.046105]  [<ffffffff811a48fa>] alloc_page_interleave+0x3a/0x90
>>>> [40003.048419]  [<ffffffff811a79c3>] alloc_pages_vma+0x143/0x200
>>>> [40003.050582]  [<ffffffff81188035>] handle_mm_fault+0x1355/0x1770
>>>> [40003.052674]  [<ffffffff8118e4c5>] ? do_mmap_pgoff+0x2f5/0x3c0
>>>> [40003.054737]  [<ffffffff8105dafc>] __do_page_fault+0x17c/0x420
>>>> [40003.056858]  [<ffffffff8118c976>] ? SyS_mmap_pgoff+0x116/0x270
>>>> [40003.059082]  [<ffffffff8105ddc2>] do_page_fault+0x22/0x30
>>>> [40003.061084]  [<ffffffff814e2ad8>] page_fault+0x28/0x30
>>>> [40003.062938] Mem-Info:
>>>> [40003.063762] active_anon:5437903 inactive_anon:1025 isolated_anon:0
>>>>  active_file:51 inactive_file:8 isolated_file:0
>>>>  unevictable:2088582 dirty:0 writeback:0 unstable:0
>>>>  slab_reclaimable:82028 slab_unreclaimable:12209
>>>>  mapped:31065 shmem:20 pagetables:37089 bounce:0
>>>>  free:35830 free_pcp:3141 free_cma:0
>>>> [40003.075549] Node 0 DMA free:15872kB min:8kB low:8kB high:12kB
>>>> active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB
>>>> unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15988kB
>>>> managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB
>>>> slab_reclaimable:32kB slab_unreclaimable:0kB kernel_stack:0kB
>>>> pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB
>>>> free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
>>>> [40003.090267] lowmem_reserve[]: 0 3746 30128 30128
>>>> [40003.092182] Node 0 DMA32 free:108236kB min:2756kB low:3444kB
>>>> high:4132kB active_anon:2400616kB inactive_anon:4060kB active_file:0kB
>>>> inactive_file:0kB unevictable:1049732kB isolated(anon):0kB
>>>> isolated(file):0kB present:3915776kB managed:3840296kB mlocked:1049732kB
>>>> dirty:4kB writeback:0kB mapped:16564kB shmem:12kB slab_reclaimable:243852kB
>>>> slab_unreclaimable:8832kB kernel_stack:1152kB pagetables:16532kB
>>>> unstable:0kB bounce:0kB free_pcp:5716kB local_pcp:220kB free_cma:0kB
>>>> writeback_tmp:0kB pages_scanned:5408 all_unreclaimable? yes
>>>> [40003.108802] lowmem_reserve[]: 0 0 26382 26382
>>>> [40003.110578] Node 0 Normal free:19212kB min:19412kB low:24264kB
>>>> high:29116kB active_anon:19350996kB inactive_anon:40kB active_file:212kB
>>>> inactive_file:80kB unevictable:7304596kB isolated(anon):0kB
>>>> isolated(file):0kB present:27525120kB managed:27015196kB mlocked:7304596kB
>>>> dirty:0kB writeback:0kB mapped:107696kB shmem:68kB slab_reclaimable:84228kB
>>>> slab_unreclaimable:40004kB kernel_stack:10000kB pagetables:131824kB
>>>> unstable:0kB bounce:0kB free_pcp:6848kB local_pcp:692kB free_cma:0kB
>>>> writeback_tmp:0kB pages_scanned:38332 all_unreclaimable? yes
>>>> [40003.128300] lowmem_reserve[]: 0 0 0 0
>>>> [40003.129844] Node 0 DMA: 0*4kB 0*8kB 0*16kB 2*32kB (UE) 3*64kB (UE)
>>>> 2*128kB (UE) 2*256kB (UE) 1*512kB (E) 2*1024kB (UE) 2*2048kB (ER) 2*4096kB
>>>> (M) = 15872kB
>>>> [40003.135917] Node 0 DMA32: 193*4kB (UEM) 254*8kB (UEM) 714*16kB (UE)
>>>> 1344*32kB (UEMR) 249*64kB (UEMR) 120*128kB (UER) 53*256kB (ER) 10*512kB
>>>> (ER) 1*1024kB (E) 0*2048kB 0*4096kB = 108244kB
>>>> [40003.142956] Node 0 Normal: 3956*4kB (UE) 0*8kB 1*16kB (R) 8*32kB (R)
>>>> 3*64kB (R) 2*128kB (R) 3*256kB (R) 0*512kB 0*1024kB 1*2048kB (R) 0*4096kB =
>>>> 19360kB
>>>> [40003.148749] Node 0 hugepages_total=0 hugepages_free=0
>>>> hugepages_surp=0 hugepages_size=2048kB
>>>> [40003.151777] 31304 total pagecache pages
>>>> [40003.153288] 0 pages in swap cache
>>>> [40003.154528] Swap cache stats: add 0, delete 0, find 0/0
>>>> [40003.156377] Free swap  = 0kB
>>>> [40003.157423] Total swap = 0kB
>>>> [40003.158465] 7864221 pages RAM
>>>> [40003.159522] 0 pages HighMem/MovableOnly
>>>> [40003.160984] 146372 pages reserved
>>>> [40003.162244] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds
>>>> swapents oom_score_adj name
>>>> [40003.165398] [ 2560]     0  2560     2804      181      11       3
>>>>     0         -1000 udevd
>>>> [40003.168638] [ 3976]     0  3976     2334      123       9       3
>>>>     0             0 dhclient
>>>> [40003.171895] [ 4017]     0  4017    11626       89      23       4
>>>>     0         -1000 auditd
>>>> [40003.175080] [ 4035]     0  4035    61861       99      23       3
>>>>     0             0 rsyslogd
>>>> [40003.178198] [ 4046]     0  4046     3462       98      10       3
>>>>     0             0 irqbalance
>>>> [40003.181559] [ 4052]     0  4052     1096       22       7       3
>>>>     0             0 rngd
>>>> [40003.184683] [ 4067]    32  4067     8815       99      22       3
>>>>     0             0 rpcbind
>>>> [40003.187772] [ 4084]    29  4084     9957      201      24       3
>>>>     0             0 rpc.statd
>>>> [40003.191099] [ 4115]    81  4115     5442       60      15       3
>>>>     0             0 dbus-daemon
>>>> [40003.194438] [ 4333]     0  4333    19452      522      40       3
>>>>     0         -1000 sshd
>>>> [40003.197432] [ 4361]    38  4361     7321      562      19       3
>>>>     0             0 ntpd
>>>> [40003.200609] [ 4376]     0  4376    22238      720      46       3
>>>>     0             0 sendmail
>>>> [40003.203868] [ 4384]    51  4384    20103      674      41       3
>>>>     0             0 sendmail
>>>> [40003.206963] [ 4515]     0  4515     4267       38      13       3
>>>>     0             0 atd
>>>> [40003.210100] [ 6730]     0  6730    29888      547      13       3
>>>>     0             0 crond
>>>> [40003.213267] [13533]   497 13533 47235415  7455314   36074     167
>>>>     0             0 java
>>>> [40003.216364] [13674]   498 13674    49154     3168      51       3
>>>>     0             0 supervisord
>>>> [40003.219721] [13680]   498 13680    51046     5350      69       3
>>>>     0             0 python
>>>> [40003.222908] [13682]   498 13682    36172     5602      75       3
>>>>     0             0 python
>>>> [40003.225952] [13683]   498 13683    32633     5319      68       3
>>>>     0             0 python
>>>> [40003.229108] [13684]   498 13684    29577     5003      63       3
>>>>     0             0 python
>>>> [40003.232263] [13719]   498 13719  1035920    41287     234       8
>>>>     0             0 java
>>>> [40003.235287] [13753]   498 13753    34605     5645      70       3
>>>>     0             0 python
>>>> [40003.238322] [14143]     0 14143     1615      420       9       3
>>>>     0             0 agetty
>>>> [40003.241582] [14145]     0 14145     1078      377       8       3
>>>>     0             0 mingetty
>>>> [40003.244752] [14147]     0 14147     1078      354       8       3
>>>>     0             0 mingetty
>>>> [40003.247833] [14149]     0 14149     1078      373       8       3
>>>>     0             0 mingetty
>>>> [40003.251193] [14151]     0 14151     1078      367       7       3
>>>>     0             0 mingetty
>>>> [40003.254342] [14153]     0 14153     1078      348       8       3
>>>>     0             0 mingetty
>>>> [40003.257443] [14154]     0 14154     2803      182      10       3
>>>>     0         -1000 udevd
>>>> [40003.260688] [14155]     0 14155     2803      182      10       3
>>>>     0         -1000 udevd
>>>> [40003.263782] [14157]     0 14157     1078      369       8       3
>>>>     0             0 mingetty
>>>> [40003.266895] Out of memory: Kill process 13533 (java) score 970 or
>>>> sacrifice child
>>>> [40003.269702] Killed process 13533 (java) total-vm:188941660kB,
>>>> anon-rss:29710828kB, file-rss:110428kB
>>>>
>>>>
>>
>

Re: Cassandra causing OOM Killer to strike on new cluster running 3.4

Posted by Jack Krupansky <ja...@gmail.com>.
What is your schema and data like - in particular, how wide are your
partitions (number of rows and typical row size)?

Maybe you just need (a lot) more heap for rows during the repair process.
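
If it does turn out to be plain heap pressure during repair, the usual knob is
in conf/cassandra-env.sh (or jvm.options on 3.x); a minimal sketch, with the
sizes below as placeholder values for a 30GB box rather than a recommendation:

    # conf/cassandra-env.sh -- override the auto-calculated heap sizes (restart required)
    MAX_HEAP_SIZE="12G"
    HEAP_NEWSIZE="2400M"

The two variables are meant to be set as a pair, and the node has to be
restarted to pick them up.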

-- Jack Krupansky

On Fri, Mar 11, 2016 at 11:19 AM, Adam Plumb <ap...@fiksu.com> wrote:

> These are brand new boxes only running Cassandra.  Yeah the kernel is what
> is killing the JVM, and this does appear to be a memory leak in Cassandra.
> And Cassandra is the only thing running, aside from the basic services
> needed for Amazon Linux to run.
>
> On Fri, Mar 11, 2016 at 11:17 AM, Sebastian Estevez <
> sebastian.estevez@datastax.com> wrote:
>
>> Sacrifice child in dmesg is your OS killing the process with the most
>> ram. That means you're actually running out of memory at the Linux level
>> outside of the JVM.
>>
>> Are you running anything other than Cassandra on this box?
>>
>> If so, does it have a memory leak?
>>
>> all the best,
>>
>> Sebastián
>> On Mar 11, 2016 11:14 AM, "Adam Plumb" <ap...@fiksu.com> wrote:
>>
>>> I've got a new cluster of 18 nodes running Cassandra 3.4 that I just
>>> launched and loaded data into yesterday (roughly 2TB of total storage) and
>>> am seeing runaway memory usage.  These nodes are EC2 c3.4xlarges with 30GB
>>> RAM and the heap size is set to 8G with a new heap size of 1.6G.
>>>
>>> Last night I finished loading up the data, then ran an incremental
>>> repair on one of the nodes just to ensure that everything was working
>>> (nodetool repair).  Over night all 18 nodes ran out of memory and were
>>> killed by the OOM killer.  I restarted them this morning and they all came
>>> up fine, but just started churning through memory and got killed again.  I
>>> restarted them again and they're doing the same thing.  I'm not getting any
>>> errors in the system log, since the process is getting killed abruptly
>>> (which makes me think this is a native memory issue, not heap)
>>>
>>> Obviously this behavior isn't the best.  I'm willing to provide any data
>>> people need to help debug this, these nodes are still up and running.  I'm
>>> also in IRC if anyone wants to jump on there.
>>>
>>> Here is the output of ps aux:
>>>
>>> 497       64351  108 89.5 187156072 27642988 ?  SLl  15:13  62:15 java
>>>> -ea -XX:+CMSClassUnloadingEnabled -XX:+UseThreadPriorities
>>>> -XX:ThreadPriorityPolicy=42 -Xms7536M -Xmx7536M -Xmn1600M
>>>> -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=1000003
>>>> -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
>>>> -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1
>>>> -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
>>>> -XX:+UseTLAB -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=45
>>>> -XX:-ParallelRefProcEnabled -XX:-AlwaysPreTouch -XX:+UseBiasedLocking
>>>> -XX:+UseTLAB -XX:+ResizeTLAB -Djava.net.preferIPv4Stack=true
>>>> -Dcom.sun.management.jmxremote.port=7199
>>>> -Dcom.sun.management.jmxremote.rmi.port=7199
>>>> -Dcom.sun.management.jmxremote.ssl=false
>>>> -Dcom.sun.management.jmxremote.authenticate=false
>>>> -XX:+CMSClassUnloadingEnabled -Dlogback.configurationFile=logback.xml -D
>>>> cassandra.logdir=/usr/local/cassandra/logs -Dcas
>>>> sandra.storagedir=/usr/local/cassandra/data -Dcas
>>>> sandra-pidfile=/var/run/cassandra/cassandra.pid -cp /usr/local/
>>>> cassandra/conf:/usr/local/cassandra/build/classes/main:/usr/local/
>>>> cassandra/build/classes/thrift:/usr/local/cas
>>>> sandra/lib/airline-0.6.jar:/usr/local/cas
>>>> sandra/lib/antlr-runtime-3.5.2.jar:/usr/local/cassandra/lib/apache-
>>>> cassandra-3.4.jar:/usr/local/cassandra/lib/apache-cas
>>>> sandra-clientutil-3.4.jar:/usr/local/cassandra/lib/apache-cas
>>>> sandra-thrift-3.4.jar:/usr/local/cas
>>>> sandra/lib/asm-5.0.4.jar:/usr/local/cassandra/lib/cas
>>>> sandra-driver-core-3.0.0-shaded.jar:/usr/local/cas
>>>> sandra/lib/commons-cli-1.1.jar:/usr/local/cas
>>>> sandra/lib/commons-codec-1.2.jar:/usr/local/cas
>>>> sandra/lib/commons-lang3-3.1.jar:/usr/local/cas
>>>> sandra/lib/commons-math3-3.2.jar:/usr/local/cas
>>>> sandra/lib/compress-lzf-0.8.4.jar:/usr/local/cas
>>>> sandra/lib/concurrentlinkedhashmap-lru-1.4.jar:/usr/local/cas
>>>> sandra/lib/concurrent-trees-2.4.0.jar:/usr/local/cas
>>>> sandra/lib/disruptor-3.0.1.jar:/usr/local/cas
>>>> sandra/lib/ecj-4.4.2.jar:/usr/local/cas
>>>> sandra/lib/guava-18.0.jar:/usr/local/cas
>>>> sandra/lib/high-scale-lib-1.0.6.jar:/usr/local/cas
>>>> sandra/lib/hppc-0.5.4.jar:/usr/local/cas
>>>> sandra/lib/jackson-core-asl-1.9.2.jar:/usr/local/cas
>>>> sandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/local/cas
>>>> sandra/lib/jamm-0.3.0.jar:/usr/local/cas
>>>> sandra/lib/javax.inject.jar:/usr/local/cas
>>>> sandra/lib/jbcrypt-0.3m.jar:/usr/local/cas
>>>> sandra/lib/jcl-over-slf4j-1.7.7.jar:/usr/local/cas
>>>> sandra/lib/jflex-1.6.0.jar:/usr/local/cas
>>>> sandra/lib/jna-4.0.0.jar:/usr/local/cas
>>>> sandra/lib/joda-time-2.4.jar:/usr/local/cas
>>>> sandra/lib/json-simple-1.1.jar:/usr/local/cas
>>>> sandra/lib/libthrift-0.9.2.jar:/usr/local/cas
>>>> sandra/lib/log4j-over-slf4j-1.7.7.jar:/usr/local/cas
>>>> sandra/lib/logback-classic-1.1.3.jar:/usr/local/cas
>>>> sandra/lib/logback-core-1.1.3.jar:/usr/local/cas
>>>> sandra/lib/lz4-1.3.0.jar:/usr/local/cas
>>>> sandra/lib/metrics-core-3.1.0.jar:/usr/local/cas
>>>> sandra/lib/metrics-logback-3.1.0.jar:/usr/local/cas
>>>> sandra/lib/netty-all-4.0.23.Final.jar:/usr/local/cas
>>>> sandra/lib/ohc-core-0.4.2.jar:/usr/local/cas
>>>> sandra/lib/ohc-core-j8-0.4.2.jar:/usr/local/cas
>>>> sandra/lib/primitive-1.0.jar:/usr/local/cas
>>>> sandra/lib/reporter-config3-3.0.0.jar:/usr/local/cas
>>>> sandra/lib/reporter-config-base-3.0.0.jar:/usr/local/cas
>>>> sandra/lib/sigar-1.6.4.jar:/usr/local/cas
>>>> sandra/lib/slf4j-api-1.7.7.jar:/usr/local/cas
>>>> sandra/lib/snakeyaml-1.11.jar:/usr/local/cas
>>>> sandra/lib/snappy-java-1.1.1.7.jar:/usr/local/cas
>>>> sandra/lib/snowball-stemmer-1.3.0.581.1.jar:/usr/local/cas
>>>> sandra/lib/ST4-4.0.8.jar:/usr/local/cas
>>>> sandra/lib/stream-2.5.2.jar:/usr/local/cas
>>>> sandra/lib/thrift-server-0.3.7.jar:/usr/local/cassandra/lib/jsr223/*/*.jar
>>>> org.apache.cassandra.service.CassandraDaemon
>>>
>>>
>>>  Here is some dmesg output:
>>>
>>> [40003.010117] java invoked oom-killer: gfp_mask=0x280da, order=0,
>>> oom_score_adj=0
>>> [40003.013042] java cpuset=/ mems_allowed=0
>>> [40003.014789] CPU: 3 PID: 37757 Comm: java Tainted: G            E
>>> 4.1.7-15.23.amzn1.x86_64 #1
>>> [40003.017852] Hardware name: Xen HVM domU, BIOS 4.2.amazon 12/07/2015
>>> [40003.020066]  0000000000000000 ffff8800ebaaba18 ffffffff814da12c
>>> 0000000000000000
>>> [40003.022870]  ffff880763594c80 ffff8800ebaabac8 ffffffff814d7939
>>> ffff8800ebaaba78
>>> [40003.025674]  ffffffff811bf8f7 ffff880770679c00 ffff88077001c190
>>> 0000000000000080
>>> [40003.028660] Call Trace:
>>> [40003.029613]  [<ffffffff814da12c>] dump_stack+0x45/0x57
>>> [40003.031486]  [<ffffffff814d7939>] dump_header+0x7f/0x1fe
>>> [40003.033390]  [<ffffffff811bf8f7>] ? mem_cgroup_iter+0x137/0x3d0
>>> [40003.035475]  [<ffffffff8107f496>] ? __queue_work+0x136/0x320
>>> [40003.037594]  [<ffffffff8115d11c>] oom_kill_process+0x1cc/0x3b0
>>> [40003.039825]  [<ffffffff8115d67e>] __out_of_memory+0x31e/0x530
>>> [40003.041938]  [<ffffffff8115da2b>] out_of_memory+0x5b/0x80
>>> [40003.043857]  [<ffffffff81162a79>] __alloc_pages_nodemask+0x8a9/0x8d0
>>> [40003.046105]  [<ffffffff811a48fa>] alloc_page_interleave+0x3a/0x90
>>> [40003.048419]  [<ffffffff811a79c3>] alloc_pages_vma+0x143/0x200
>>> [40003.050582]  [<ffffffff81188035>] handle_mm_fault+0x1355/0x1770
>>> [40003.052674]  [<ffffffff8118e4c5>] ? do_mmap_pgoff+0x2f5/0x3c0
>>> [40003.054737]  [<ffffffff8105dafc>] __do_page_fault+0x17c/0x420
>>> [40003.056858]  [<ffffffff8118c976>] ? SyS_mmap_pgoff+0x116/0x270
>>> [40003.059082]  [<ffffffff8105ddc2>] do_page_fault+0x22/0x30
>>> [40003.061084]  [<ffffffff814e2ad8>] page_fault+0x28/0x30
>>> [40003.062938] Mem-Info:
>>> [40003.063762] active_anon:5437903 inactive_anon:1025 isolated_anon:0
>>>  active_file:51 inactive_file:8 isolated_file:0
>>>  unevictable:2088582 dirty:0 writeback:0 unstable:0
>>>  slab_reclaimable:82028 slab_unreclaimable:12209
>>>  mapped:31065 shmem:20 pagetables:37089 bounce:0
>>>  free:35830 free_pcp:3141 free_cma:0
>>> [40003.075549] Node 0 DMA free:15872kB min:8kB low:8kB high:12kB
>>> active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB
>>> unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15988kB
>>> managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB
>>> slab_reclaimable:32kB slab_unreclaimable:0kB kernel_stack:0kB
>>> pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB
>>> free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
>>> [40003.090267] lowmem_reserve[]: 0 3746 30128 30128
>>> [40003.092182] Node 0 DMA32 free:108236kB min:2756kB low:3444kB
>>> high:4132kB active_anon:2400616kB inactive_anon:4060kB active_file:0kB
>>> inactive_file:0kB unevictable:1049732kB isolated(anon):0kB
>>> isolated(file):0kB present:3915776kB managed:3840296kB mlocked:1049732kB
>>> dirty:4kB writeback:0kB mapped:16564kB shmem:12kB slab_reclaimable:243852kB
>>> slab_unreclaimable:8832kB kernel_stack:1152kB pagetables:16532kB
>>> unstable:0kB bounce:0kB free_pcp:5716kB local_pcp:220kB free_cma:0kB
>>> writeback_tmp:0kB pages_scanned:5408 all_unreclaimable? yes
>>> [40003.108802] lowmem_reserve[]: 0 0 26382 26382
>>> [40003.110578] Node 0 Normal free:19212kB min:19412kB low:24264kB
>>> high:29116kB active_anon:19350996kB inactive_anon:40kB active_file:212kB
>>> inactive_file:80kB unevictable:7304596kB isolated(anon):0kB
>>> isolated(file):0kB present:27525120kB managed:27015196kB mlocked:7304596kB
>>> dirty:0kB writeback:0kB mapped:107696kB shmem:68kB slab_reclaimable:84228kB
>>> slab_unreclaimable:40004kB kernel_stack:10000kB pagetables:131824kB
>>> unstable:0kB bounce:0kB free_pcp:6848kB local_pcp:692kB free_cma:0kB
>>> writeback_tmp:0kB pages_scanned:38332 all_unreclaimable? yes
>>> [40003.128300] lowmem_reserve[]: 0 0 0 0
>>> [40003.129844] Node 0 DMA: 0*4kB 0*8kB 0*16kB 2*32kB (UE) 3*64kB (UE)
>>> 2*128kB (UE) 2*256kB (UE) 1*512kB (E) 2*1024kB (UE) 2*2048kB (ER) 2*4096kB
>>> (M) = 15872kB
>>> [40003.135917] Node 0 DMA32: 193*4kB (UEM) 254*8kB (UEM) 714*16kB (UE)
>>> 1344*32kB (UEMR) 249*64kB (UEMR) 120*128kB (UER) 53*256kB (ER) 10*512kB
>>> (ER) 1*1024kB (E) 0*2048kB 0*4096kB = 108244kB
>>> [40003.142956] Node 0 Normal: 3956*4kB (UE) 0*8kB 1*16kB (R) 8*32kB (R)
>>> 3*64kB (R) 2*128kB (R) 3*256kB (R) 0*512kB 0*1024kB 1*2048kB (R) 0*4096kB =
>>> 19360kB
>>> [40003.148749] Node 0 hugepages_total=0 hugepages_free=0
>>> hugepages_surp=0 hugepages_size=2048kB
>>> [40003.151777] 31304 total pagecache pages
>>> [40003.153288] 0 pages in swap cache
>>> [40003.154528] Swap cache stats: add 0, delete 0, find 0/0
>>> [40003.156377] Free swap  = 0kB
>>> [40003.157423] Total swap = 0kB
>>> [40003.158465] 7864221 pages RAM
>>> [40003.159522] 0 pages HighMem/MovableOnly
>>> [40003.160984] 146372 pages reserved
>>> [40003.162244] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds
>>> swapents oom_score_adj name
>>> [40003.165398] [ 2560]     0  2560     2804      181      11       3
>>>     0         -1000 udevd
>>> [40003.168638] [ 3976]     0  3976     2334      123       9       3
>>>     0             0 dhclient
>>> [40003.171895] [ 4017]     0  4017    11626       89      23       4
>>>     0         -1000 auditd
>>> [40003.175080] [ 4035]     0  4035    61861       99      23       3
>>>     0             0 rsyslogd
>>> [40003.178198] [ 4046]     0  4046     3462       98      10       3
>>>     0             0 irqbalance
>>> [40003.181559] [ 4052]     0  4052     1096       22       7       3
>>>     0             0 rngd
>>> [40003.184683] [ 4067]    32  4067     8815       99      22       3
>>>     0             0 rpcbind
>>> [40003.187772] [ 4084]    29  4084     9957      201      24       3
>>>     0             0 rpc.statd
>>> [40003.191099] [ 4115]    81  4115     5442       60      15       3
>>>     0             0 dbus-daemon
>>> [40003.194438] [ 4333]     0  4333    19452      522      40       3
>>>     0         -1000 sshd
>>> [40003.197432] [ 4361]    38  4361     7321      562      19       3
>>>     0             0 ntpd
>>> [40003.200609] [ 4376]     0  4376    22238      720      46       3
>>>     0             0 sendmail
>>> [40003.203868] [ 4384]    51  4384    20103      674      41       3
>>>     0             0 sendmail
>>> [40003.206963] [ 4515]     0  4515     4267       38      13       3
>>>     0             0 atd
>>> [40003.210100] [ 6730]     0  6730    29888      547      13       3
>>>     0             0 crond
>>> [40003.213267] [13533]   497 13533 47235415  7455314   36074     167
>>>     0             0 java
>>> [40003.216364] [13674]   498 13674    49154     3168      51       3
>>>     0             0 supervisord
>>> [40003.219721] [13680]   498 13680    51046     5350      69       3
>>>     0             0 python
>>> [40003.222908] [13682]   498 13682    36172     5602      75       3
>>>     0             0 python
>>> [40003.225952] [13683]   498 13683    32633     5319      68       3
>>>     0             0 python
>>> [40003.229108] [13684]   498 13684    29577     5003      63       3
>>>     0             0 python
>>> [40003.232263] [13719]   498 13719  1035920    41287     234       8
>>>     0             0 java
>>> [40003.235287] [13753]   498 13753    34605     5645      70       3
>>>     0             0 python
>>> [40003.238322] [14143]     0 14143     1615      420       9       3
>>>     0             0 agetty
>>> [40003.241582] [14145]     0 14145     1078      377       8       3
>>>     0             0 mingetty
>>> [40003.244752] [14147]     0 14147     1078      354       8       3
>>>     0             0 mingetty
>>> [40003.247833] [14149]     0 14149     1078      373       8       3
>>>     0             0 mingetty
>>> [40003.251193] [14151]     0 14151     1078      367       7       3
>>>     0             0 mingetty
>>> [40003.254342] [14153]     0 14153     1078      348       8       3
>>>     0             0 mingetty
>>> [40003.257443] [14154]     0 14154     2803      182      10       3
>>>     0         -1000 udevd
>>> [40003.260688] [14155]     0 14155     2803      182      10       3
>>>     0         -1000 udevd
>>> [40003.263782] [14157]     0 14157     1078      369       8       3
>>>     0             0 mingetty
>>> [40003.266895] Out of memory: Kill process 13533 (java) score 970 or
>>> sacrifice child
>>> [40003.269702] Killed process 13533 (java) total-vm:188941660kB,
>>> anon-rss:29710828kB, file-rss:110428kB
>>>
>>>
>
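
Reading the tail of that OOM record: the kernel reports anon-rss of
29710828kB for the Cassandra JVM against an -Xmx of 7536M. Rough arithmetic
on those two figures (nothing new here, just the numbers quoted above):

echo $(( 29710828 / 1024 / 1024 ))   # anon-rss -> prints 28, i.e. ~28 GiB resident
echo $(( 7536 / 1024 ))              # -Xmx7536M -> prints 7, i.e. ~7 GiB heap ceiling

So roughly 20 GiB of that process lives outside the Java heap, which fits the
suspicion in this thread that the problem is native memory rather than heap.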

Re: Cassandra causing OOM Killer to strike on new cluster running 3.4

Posted by Adam Plumb <ap...@fiksu.com>.
These are brand new boxes running only Cassandra, aside from the basic
services Amazon Linux needs. Yes, the kernel (the OOM killer) is what is
killing the JVM, and this does appear to be a memory leak in Cassandra.
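
One way to see whether the growth really is off-heap (rather than the 8G
Java heap) is to watch the process RSS next to the JVM's own native-memory
accounting. A rough sketch, assuming a single Cassandra JVM on the box and
that you can restart it once to enable native memory tracking:

# total resident set of the Cassandra JVM
pmap -x $(pgrep -f CassandraDaemon) | tail -n 1

# after adding -XX:NativeMemoryTracking=summary to the JVM options
# (e.g. in cassandra-env.sh) and restarting, ask the JVM what it
# has allocated natively:
jcmd $(pgrep -f CassandraDaemon) VM.native_memory summary

If the pmap total keeps climbing while the NMT summary stays roughly flat,
the growth is coming from allocations the JVM doesn't track itself, which
points at native/off-heap memory rather than the heap.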

On Fri, Mar 11, 2016 at 11:17 AM, Sebastian Estevez <
sebastian.estevez@datastax.com> wrote:

> "Sacrifice child" in dmesg is your OS killing the process with the most RAM.
> That means you're actually running out of memory at the Linux level, outside
> of the JVM.
>
> Are you running anything other than Cassandra on this box?
>
> If so, does it have a memory leak?
>
> all the best,
>
> Sebastián
> On Mar 11, 2016 11:14 AM, "Adam Plumb" <ap...@fiksu.com> wrote:
>
>> I've got a new cluster of 18 nodes running Cassandra 3.4 that I just
>> launched and loaded data into yesterday (roughly 2TB of total storage) and
>> am seeing runaway memory usage.  These nodes are EC2 c3.4xlarges with 30GB
>> RAM and the heap size is set to 8G with a new heap size of 1.6G.
>>
>> Last night I finished loading up the data, then ran an incremental repair
>> on one of the nodes just to ensure that everything was working (nodetool
>> repair).  Over night all 18 nodes ran out of memory and were killed by the
>> OOM killer.  I restarted them this morning and they all came up fine, but
>> just started churning through memory and got killed again.  I restarted
>> them again and they're doing the same thing.  I'm not getting any errors in
>> the system log, since the process is getting killed abruptly (which makes
>> me think this is a native memory issue, not heap)
>>
>> Obviously this behavior isn't the best.  I'm willing to provide any data
>> people need to help debug this, these nodes are still up and running.  I'm
>> also in IRC if anyone wants to jump on there.

Re: Cassandra causing OOM Killer to strike on new cluster running 3.4

Posted by Sebastian Estevez <se...@datastax.com>.
"Sacrifice child" in dmesg is your OS killing the process with the most RAM.
That means you're actually running out of memory at the Linux level, outside
of the JVM.

Are you running anything other than Cassandra on this box?

If so, does it have a memory leak?
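
For what it's worth, a quick way to answer that from the box itself (stock
tools only, nothing Cassandra-specific):

# biggest memory consumers and the overall picture
ps aux --sort=-rss | head -n 5
free -m

# anything the OOM killer has already logged
dmesg | grep -i "out of memory"

If the java process owns nearly all of the RSS, the next step is figuring
out what inside Cassandra is allocating it.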

all the best,

Sebastián
On Mar 11, 2016 11:14 AM, "Adam Plumb" <ap...@fiksu.com> wrote:

> I've got a new cluster of 18 nodes running Cassandra 3.4 that I just
> launched and loaded data into yesterday (roughly 2TB of total storage) and
> am seeing runaway memory usage.  These nodes are EC2 c3.4xlarges with 30GB
> RAM and the heap size is set to 8G with a new heap size of 1.6G.
>
> Last night I finished loading up the data, then ran an incremental repair
> on one of the nodes just to ensure that everything was working (nodetool
> repair).  Over night all 18 nodes ran out of memory and were killed by the
> OOM killer.  I restarted them this morning and they all came up fine, but
> just started churning through memory and got killed again.  I restarted
> them again and they're doing the same thing.  I'm not getting any errors in
> the system log, since the process is getting killed abruptly (which makes
> me think this is a native memory issue, not heap)
>
> Obviously this behavior isn't the best.  I'm willing to provide any data
> people need to help debug this, these nodes are still up and running.  I'm
> also in IRC if anyone wants to jump on there.