Posted to user@cassandra.apache.org by Ayub M <hi...@gmail.com> on 2019/05/01 06:13:06 UTC
Re: cassandra node was put down with oom error
Do you have search on the same nodes, or is it only Cassandra? In my case it
was due to a memory leak bug in DSE Search that consumed more memory,
resulting in the OOM.
On Tue, Apr 30, 2019, 2:58 AM yeomii999@gmail.com <ye...@gmail.com>
wrote:
> Hello,
>
> I'm suffering from a similar problem with OSS Cassandra version 3.11.3.
> My Cassandra cluster has been running for more than a year, and there
> was no problem until this year.
> The cluster is write-intensive, consists of 70 nodes, and all rows have a
> 2-hour TTL.
> The only change was the read consistency, from QUORUM to ONE. (I cannot
> revert this change because of the read latency.)
> Below is my compaction strategy.
> ```
> compaction = {'class':
> 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy',
> 'compaction_window_size': '3', 'compaction_window_unit': 'MINUTES',
> 'enabled': 'true', 'max_threshold': '32', 'min_threshold': '4',
> 'tombstone_compaction_interval': '60', 'tombstone_threshold': '0.2',
> 'unchecked_tombstone_compaction': 'false'}
> ```
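For context, a 2-hour TTL with 3-minute TWCS windows implies roughly 40 time windows holding live data per table, which is on the high side; common guidance aims for a few dozen windows at most. A minimal sketch of the arithmetic, assuming rows become droppable exactly at TTL expiry:

```python
# Rough TWCS arithmetic: how many time windows hold live (non-expired) data,
# given the TTL and window size from the compaction settings above.
ttl_minutes = 2 * 60        # 2 hr TTL on all rows
window_minutes = 3          # compaction_window_size = 3, unit = MINUTES
live_windows = ttl_minutes // window_minutes
print(live_windows)         # 40
```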
> I've tried rolling-restarting the cluster several times,
> but the memory usage of the Cassandra process always climbs back up.
> I also tried Native Memory Tracking, but it reports less memory
> usage than the system measures (RSS in /proc/{cassandra-pid}/status).
>
> Is there any way that I could figure out the cause of this problem?
>
>
> On 2019/01/26 20:53:26, Jeff Jirsa <jj...@gmail.com> wrote:
> > You’re running DSE so the OSS list may not be much help. DataStax may
> > have more insight.
> >
> > In open source, the only things offheap that vary significantly are
> bloom filters and compression offsets - both scale with disk space, and
> both increase during compaction. Large STCS compaction can cause pretty
> meaningful allocations for these. Also, if you have an unusually low
> compression chunk size or a very low bloom filter FP ratio, those will be
> larger.
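To put numbers on the bloom-filter point: the optimal filter size is about -ln(p) / (ln 2)^2 bits per key, so driving the false-positive ratio down makes the off-heap allocation grow quickly. A hedged sketch of that scaling (the key count here is an illustrative assumption, not a figure from the thread):

```python
import math

def bloom_bits_per_key(fp_ratio: float) -> float:
    # Optimal Bloom filter sizing: m/n = -ln(p) / (ln 2)^2 bits per key
    return -math.log(fp_ratio) / math.log(2) ** 2

keys = 1_000_000_000  # hypothetical: one billion partitions on the node
for p in (0.01, 0.001, 0.0001):
    gib = bloom_bits_per_key(p) * keys / 8 / 2**30
    print(f"fp_ratio={p}: ~{gib:.1f} GiB off-heap")
```

At a billion keys the filter sits in the low single-digit GiB range, and it roughly doubles as the FP ratio drops two orders of magnitude, which is why an unusually low FP ratio shows up as off-heap growth.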
> >
> >
> > --
> > Jeff Jirsa
> >
> >
> > > On Jan 26, 2019, at 12:11 PM, Ayub M <hi...@gmail.com> wrote:
> > >
> > > Cassandra node went down due to OOM, and checking the /var/log/message
> I see below.
> > >
> > > ```
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: java invoked oom-killer:
> gfp_mask=0x280da, order=0, oom_score_adj=0
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: java cpuset=/ mems_allowed=0
> > > ....
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: Node 0 DMA: 1*4kB (U) 0*8kB
> 0*16kB 1*32kB (U) 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U)
> 1*2048kB (M) 3*4096kB (M) = 15908kB
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: Node 0 DMA32: 1294*4kB (UM)
> 932*8kB (UEM) 897*16kB (UEM) 483*32kB (UEM) 224*64kB (UEM) 114*128kB (UEM)
> 41*256kB (UEM) 12*512kB (UEM) 7*1024kB (UE
> > > M) 2*2048kB (EM) 35*4096kB (UM) = 242632kB
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: Node 0 Normal: 5319*4kB
> (UE) 3233*8kB (UEM) 960*16kB (UE) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB
> 0*1024kB 0*2048kB 0*4096kB = 62500kB
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: Node 0 hugepages_total=0
> hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: Node 0 hugepages_total=0
> hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: 38109 total pagecache pages
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: 0 pages in swap cache
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: Swap cache stats: add 0,
> delete 0, find 0/0
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: Free swap = 0kB
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: Total swap = 0kB
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: 16394647 pages RAM
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: 0 pages HighMem/MovableOnly
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: 310559 pages reserved
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [ pid ] uid tgid
> total_vm rss nr_ptes swapents oom_score_adj name
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [ 2634] 0 2634
> 41614 326 82 0 0 systemd-journal
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [ 2690] 0 2690
> 29793 541 27 0 0 lvmetad
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [ 2710] 0 2710
> 11892 762 25 0 -1000 systemd-udevd
> > > .....
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [13774] 0 13774
> 459778 97729 429 0 0 Scan Factory
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [14506] 0 14506
> 21628 5340 24 0 0 macompatsvc
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [14586] 0 14586
> 21628 5340 24 0 0 macompatsvc
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [14588] 0 14588
> 21628 5340 24 0 0 macompatsvc
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [14589] 0 14589
> 21628 5340 24 0 0 macompatsvc
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [14598] 0 14598
> 21628 5340 24 0 0 macompatsvc
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [14599] 0 14599
> 21628 5340 24 0 0 macompatsvc
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [14600] 0 14600
> 21628 5340 24 0 0 macompatsvc
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [14601] 0 14601
> 21628 5340 24 0 0 macompatsvc
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [19679] 0 19679
> 21628 5340 24 0 0 macompatsvc
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [19680] 0 19680
> 21628 5340 24 0 0 macompatsvc
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [ 9084] 1007 9084
> 2822449 260291 810 0 0 java
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [ 8509] 1007 8509
> 17223585 14908485 32510 0 0 java
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [21877] 0 21877
> 461828 97716 318 0 0 ScanAction Mgr
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [21884] 0 21884
> 496653 98605 340 0 0 OAS Manager
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [31718] 89 31718
> 25474 486 48 0 0 pickup
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [ 4891] 1007 4891
> 26999 191 9 0 0 iostat
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [ 4957] 1007 4957
> 26999 192 10 0 0 iostat
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: Out of memory: Kill process
> 8509 (java) score 928 or sacrifice child
> > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: Killed process 8509 (java)
> total-vm:68894340kB, anon-rss:59496344kB, file-rss:137596kB, shmem-rss:0kB
> > > ```
> > >
> > > Nothing else runs on this host except DSE Cassandra with search and
> > > monitoring agents. The max heap size is set to 31 GB; the Cassandra Java
> > > process seems to be using ~57 GB (RAM is 62 GB) at the time of the error.
> > > So I am guessing the JVM started using lots of memory and triggered the
> > > OOM error.
> > > Is my understanding correct?
> > > That this is a Linux-triggered kill of the JVM because it was consuming
> > > more than the available memory?
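The kernel log above supports this reading: the per-process rss column is counted in 4 kB pages, and converting the figures for pid 8509 reproduces the ~57 GB number. A quick check using only values from the log:

```python
# Convert the oom-killer figures for pid 8509 (the Cassandra JVM) to GiB.
PAGE_KB = 4                    # the kernel's rss column counts 4 kB pages
rss_pages = 14_908_485         # "rss" column for pid 8509
anon_rss_kb = 59_496_344       # from the "Killed process" line
total_vm_kb = 68_894_340

print(f"rss      ~{rss_pages * PAGE_KB / 2**20:.1f} GiB")   # ~56.9 GiB
print(f"anon-rss ~{anon_rss_kb / 2**20:.1f} GiB")           # ~56.7 GiB
print(f"total-vm ~{total_vm_kb / 2**20:.1f} GiB")           # ~65.7 GiB
```

With ~57 GiB resident out of 62 GiB of RAM and no swap configured, the oom-killer choosing this process is the expected outcome rather than a kernel misfire.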
> > >
> > > So in this case the JVM was using a max of 31 GB of heap, and the
> > > remaining 26 GB is non-heap memory. Normally this process takes around
> > > 42 GB, and given that at the moment of the OOM it was consuming 57 GB,
> > > I suspect the Java process is the culprit rather than the victim.
> > >
> > > At the time of the issue no heap dump was taken; I have configured it
> > > now. But even if a heap dump had been taken, would it have helped
> > > figure out what is consuming more memory? A heap dump only covers the
> > > heap area, so what should be used to dump non-heap memory? Native
> > > Memory Tracking is one thing I came across.
> > > Is there any way to have native memory dumped when the OOM occurs?
> > > What's the best way to monitor JVM memory to diagnose OOM errors?
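On the native-memory question: NMT only accounts for memory the JVM itself allocates, so the gap versus RSS typically hides in allocations it cannot see, such as raw malloc from native libraries, glibc arena overhead, and mmapped files counted in RSS. One low-tech starting point is to watch the resident-set breakdown the kernel already exposes. A minimal sketch that parses the relevant `/proc/<pid>/status` fields (available on reasonably recent kernels; the sample text is illustrative, echoing the numbers from the log above):

```python
def rss_breakdown(status_text: str) -> dict:
    # Pull the resident-set fields (values in kB) out of /proc/<pid>/status.
    fields = {}
    for line in status_text.splitlines():
        if line.startswith(("VmRSS", "RssAnon", "RssFile", "RssShmem")):
            key, value = line.split(":", 1)
            fields[key] = int(value.strip().split()[0])
    return fields

# Illustrative sample; on a live node use open(f"/proc/{pid}/status").read()
sample = (
    "VmRSS:\t59633940 kB\n"
    "RssAnon:\t59496344 kB\n"
    "RssFile:\t137596 kB\n"
    "RssShmem:\t0 kB\n"
)
print(rss_breakdown(sample))
```

Sampling this periodically, alongside `jcmd <pid> VM.native_memory summary` with NMT enabled, shows whether the growth is anonymous memory (a native leak candidate) or file-backed (page cache and mmapped SSTables, which is benign).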
> > >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
> > For additional commands, e-mail: user-help@cassandra.apache.org
> >
> >
>
Re: cassandra node was put down with oom error
Posted by Sandeep Nethi <ne...@gmail.com>.
I think 3.11.3 has a bug which can cause OOMs on nodes during full
repairs. Just check whether there is any correlation between the OOMs and
the repair process.
Thanks,
Sandeep
On Wed, 1 May 2019 at 11:02 PM, Mia <ye...@gmail.com> wrote:
> Hi Sandeep.
>
> I'm not running any manual repair, and I don't think any full repair is
> running.
> I don't see any logs about repair in system.log these days.
> Does a full repair have anything to do with using a large amount of memory?
>
> Thanks.
>
Re: cassandra node was put down with oom error
Posted by Mia <ye...@gmail.com>.
Hi Sandeep.
I'm not running any manual repair, and I don't think any full repair is running.
I don't see any logs about repair in system.log these days.
Does a full repair have anything to do with using a large amount of memory?
Thanks.
On 2019/05/01 10:47:50, Sandeep Nethi <ne...@gmail.com> wrote:
> Are you by any chance running the full repair on these nodes?
>
> Thanks,
> Sandeep
>
Re: cassandra node was put down with oom error
Posted by Sandeep Nethi <ne...@gmail.com>.
Are you by any chance running the full repair on these nodes?
Thanks,
Sandeep
On Wed, 1 May 2019 at 10:46 PM, Mia <ye...@gmail.com> wrote:
> Hello, Ayub.
>
> I'm using Apache Cassandra, not the DSE edition, so I have never used the
> DSE search feature.
> In my case, all the nodes of the cluster have the same problem.
>
> Thanks.
>
> On 2019/05/01 06:13:06, Ayub M <hi...@gmail.com> wrote:
> > Do you have search on the same nodes or is it only cassandra. In my case
> it
> > was due to a memory leak bug in dse search that consumed more memory
> > resulting in oom.
> >
> > On Tue, Apr 30, 2019, 2:58 AM yeomii999@gmail.com <ye...@gmail.com>
> > wrote:
> >
> > > Hello,
> > >
> > > I'm suffering from similar problem with OSS cassandra version3.11.3.
> > > My cassandra cluster have been running for longer than 1 years and
> there
> > > was no problem until this year.
> > > The cluster is write-intensive, consists of 70 nodes, and all rows
> have 2
> > > hr TTL.
> > > The only change is the read consistency from QUORUM to ONE. (I cannot
> > > revert this change because of the read latency)
> > > Below is my compaction strategy.
> > > ```
> > > compaction = {'class':
> > > 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy',
> > > 'compaction_window_size': '3', 'compaction_window_unit': 'MINUTES',
> > > 'enabled': 'true', 'max_threshold': '32', 'min_threshold': '4',
> > > 'tombstone_compaction_interval': '60', 'tombstone_threshold': '0.2',
> > > 'unchecked_tombstone_compaction': 'false'}
> > > ```
> > > I've tried rolling restarting the cluster several times,
> > > but the memory usage of cassandra process always keeps going high.
> > > I also tried Native Memory Tracking, but it only measured less memory
> > > usage than the system mesaures (RSS in /proc/{cassandra-pid}/status)
> > >
> > > Is there any way that I could figure out the cause of this problem?
> > >
> > >
> > > On 2019/01/26 20:53:26, Jeff Jirsa <jj...@gmail.com> wrote:
> > > > You’re running DSE so the OSS list may not be much help. Datastax May
> > > have more insight
> > > >
> > > > In open source, the only things offheap that vary significantly are
> > > bloom filters and compression offsets - both scale with disk space, and
> > > both increase during compaction. Large STCS compaction can cause pretty
> > > meaningful allocations for these. Also, if you have an unusually low
> > > compression chunk size or a very low bloom filter FP ratio, those will
> be
> > > larger.
> > > >
> > > >
> > > > --
> > > > Jeff Jirsa
> > > >
> > > >
> > > > > On Jan 26, 2019, at 12:11 PM, Ayub M <hi...@gmail.com> wrote:
> > > > >
> > > > > Cassandra node went down due to OOM, and checking /var/log/messages
> > > > > I see the below.
> > > > >
> > > > > ```
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: java invoked oom-killer:
> > > gfp_mask=0x280da, order=0, oom_score_adj=0
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: java cpuset=/
> mems_allowed=0
> > > > > ....
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: Node 0 DMA: 1*4kB (U)
> 0*8kB
> > > 0*16kB 1*32kB (U) 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB
> (U)
> > > 1*2048kB (M) 3*4096kB (M) = 15908kB
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: Node 0 DMA32: 1294*4kB
> (UM)
> > > 932*8kB (UEM) 897*16kB (UEM) 483*32kB (UEM) 224*64kB (UEM) 114*128kB
> (UEM)
> > > 41*256kB (UEM) 12*512kB (UEM) 7*1024kB (UE
> > > > > M) 2*2048kB (EM) 35*4096kB (UM) = 242632kB
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: Node 0 Normal: 5319*4kB
> > > (UE) 3233*8kB (UEM) 960*16kB (UE) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB
> > > 0*1024kB 0*2048kB 0*4096kB = 62500kB
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: Node 0 hugepages_total=0
> > > hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: Node 0 hugepages_total=0
> > > hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: 38109 total pagecache
> pages
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: 0 pages in swap cache
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: Swap cache stats: add 0,
> > > delete 0, find 0/0
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: Free swap = 0kB
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: Total swap = 0kB
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: 16394647 pages RAM
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: 0 pages
> HighMem/MovableOnly
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: 310559 pages reserved
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [ pid ] uid tgid
> > > total_vm rss nr_ptes swapents oom_score_adj name
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [ 2634] 0 2634
> > > 41614 326 82 0 0 systemd-journal
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [ 2690] 0 2690
> > > 29793 541 27 0 0 lvmetad
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [ 2710] 0 2710
> > > 11892 762 25 0 -1000 systemd-udevd
> > > > > .....
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [13774] 0 13774
> > > 459778 97729 429 0 0 Scan Factory
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [14506] 0 14506
> > > 21628 5340 24 0 0 macompatsvc
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [14586] 0 14586
> > > 21628 5340 24 0 0 macompatsvc
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [14588] 0 14588
> > > 21628 5340 24 0 0 macompatsvc
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [14589] 0 14589
> > > 21628 5340 24 0 0 macompatsvc
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [14598] 0 14598
> > > 21628 5340 24 0 0 macompatsvc
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [14599] 0 14599
> > > 21628 5340 24 0 0 macompatsvc
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [14600] 0 14600
> > > 21628 5340 24 0 0 macompatsvc
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [14601] 0 14601
> > > 21628 5340 24 0 0 macompatsvc
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [19679] 0 19679
> > > 21628 5340 24 0 0 macompatsvc
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [19680] 0 19680
> > > 21628 5340 24 0 0 macompatsvc
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [ 9084] 1007 9084
> > > 2822449 260291 810 0 0 java
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [ 8509] 1007 8509
> > > 17223585 14908485 32510 0 0 java
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [21877] 0 21877
> > > 461828 97716 318 0 0 ScanAction Mgr
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [21884] 0 21884
> > > 496653 98605 340 0 0 OAS Manager
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [31718] 89 31718
> > > 25474 486 48 0 0 pickup
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [ 4891] 1007 4891
> > > 26999 191 9 0 0 iostat
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: [ 4957] 1007 4957
> > > 26999 192 10 0 0 iostat
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: Out of memory: Kill
> process
> > > 8509 (java) score 928 or sacrifice child
> > > > > Jan 23 20:07:17 ip-xxx-xxx-xxx-xxx kernel: Killed process 8509
> (java)
> > > total-vm:68894340kB, anon-rss:59496344kB, file-rss:137596kB,
> shmem-rss:0kB
> > > > > ```
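Converting the oom-killer figures above into GiB (assuming 4 kB pages, as on x86-64) shows how close the java process was to total RAM:

```shell
# Figures copied from the kernel log above.
ram_kb=$((16394647 * 4))   # "16394647 pages RAM" at 4 kB per page
anon_rss_kb=59496344       # anon-rss of the killed process (pid 8509)
echo "total RAM : $((ram_kb / 1024 / 1024)) GiB"
echo "java RSS  : $((anon_rss_kb / 1024 / 1024)) GiB"
```

With no swap configured (Free swap = 0kB), anonymous memory this close to physical RAM leaves the kernel no option but the oom-killer.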
> > > > >
> > > > > Nothing else runs on this host except DSE Cassandra with search and
> > > > > monitoring agents. Max heap size is set to 31 GB; the Cassandra java
> > > > > process seems to be using ~57 GB (RAM is 62 GB) at the time of error.
> > > > > So I am guessing the JVM started using lots of memory and triggered
> > > > > the OOM error.
> > > > > Is my understanding correct - that this was a Linux-triggered kill
> > > > > because the JVM was consuming more than the available memory?
> > > > >
> > > > > So in this case the JVM was using a max of 31 GB of heap, and the
> > > > > remaining 26 GB it was using is non-heap memory. Normally this
> > > > > process takes around 42 GB, and the fact that at the moment of the
> > > > > OOM it was consuming 57 GB makes me suspect the java process is the
> > > > > culprit rather than the victim.
> > > > >
> > > > > At the time of the issue no heap dump was taken; I have configured
> > > > > it now. But even if a heap dump had been taken, would it have helped
> > > > > figure out what was consuming more memory? A heap dump only covers
> > > > > the heap area - what should be used to dump non-heap memory? Native
> > > > > Memory Tracking is one thing I came across.
> > > > > Is there any way to have native memory dumped when the OOM occurs?
> > > > > What's the best way to monitor JVM memory to diagnose OOM errors?
> > > > >
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
> > > > For additional commands, e-mail: user-help@cassandra.apache.org
> > > >
> > > >
> > >
RE: cassandra node was put down with oom error
Posted by "ZAIDI, ASAD A" <az...@att.com>.
Is there any chance partition sizes have grown over time and are taking much of the allocated memory? If yes, that could also affect the compaction threads, as they too will take more heap and keep objects in heap longer, leaving less for other processes.
You can check whether partition sizes are manageable using nodetool tablestats - ideally sizes should be even across nodes.
Check whether the number of concurrent compactors is optimal and whether compaction throughput is capped/throttled (both via the nodetool utility).
See if repair is running unusually long and taking many resources, i.e. CPU/heap etc.
Check that storage is not acting up (using iostat -x; look at the await column).
See if a bursty workload or batches are hitting the nodes and tipping over the instance, using nodetool tpstats (look at the all-time-blocked column for native-transport-requests).
The above should give some clue about what is going on.
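A quick way to act on the partition-size suggestion: feed `nodetool tablestats` output through awk and flag any table whose maximum compacted partition exceeds a threshold (100 MB here). The sample lines below are made up; in practice you would pipe real nodetool output instead:

```shell
# Hypothetical tablestats excerpt; with a live cluster, run:
#   nodetool tablestats | awk '/Compacted partition maximum bytes/ && $5 > 104857600'
sample='		Compacted partition maximum bytes: 268650950
		Compacted partition maximum bytes: 17084'
printf '%s\n' "$sample" |
  awk '/Compacted partition maximum bytes/ && $5 > 104857600 {
         printf "large partition: %.0f MB\n", $5 / 1048576 }'
```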
-----Original Message-----
From: Mia [mailto:yeomii999@gmail.com]
Sent: Wednesday, May 01, 2019 5:47 AM
To: user@cassandra.apache.org
Subject: Re: cassandra node was put down with oom error
Hello, Ayub.
I'm using Apache Cassandra, not the DSE edition, so I have never used the DSE search feature.
In my case, all the nodes in the cluster have the same problem.
Thanks.
On 2019/05/01 06:13:06, Ayub M <hi...@gmail.com> wrote:
> Do you have search on the same nodes, or is it only Cassandra? In my
> case it was due to a memory leak bug in DSE search that consumed more
> memory, resulting in OOM.
Re: cassandra node was put down with oom error
Posted by Mia <ye...@gmail.com>.
Hello, Ayub.
I'm using Apache Cassandra, not the DSE edition, so I have never used the DSE search feature.
In my case, all the nodes in the cluster have the same problem.
Thanks.