Posted to user@cassandra.apache.org by Rick Gunderson <rg...@ca.ibm.com> on 2016/04/28 19:22:59 UTC
tombstone_failure_threshold being ignored?
We are running Cassandra 2.2.3, 2 data centers, 3 nodes in each. The
replication factor per datacenter is 3. The Xmx setting on the Cassandra
JVMs is 4GB.
We have a workload that generates lots of tombstones, and Cassandra goes
OOM in about 24 hours. We've lowered the tombstone_failure_threshold to
25000, but we never see the TombstoneOverwhelmingException before the
nodes start going OOM.
The table operation that looks to be the culprit is a scan of partition
keys (i.e. we are scanning across narrow rows, not scanning within a wide
row). The heap dump shows a RangeSliceReply containing an ArrayList
with 1,823,230 org.apache.cassandra.db.Row objects with a retained heap
size of 441MiB. A look inside one of the Row objects shows an
org.apache.cassandra.db.DeletionInfo object, so I assume that means the
row has been tombstoned.
If all of the 1,823,230 Row objects are tombstoned (and it is likely that
most of them are), is there a reason that the
TombstoneOverwhelmingException never gets thrown?
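As a rough back-of-envelope check on the heap dump numbers (my arithmetic, so treat it as approximate), a single reply of this size already accounts for about a tenth of the heap:

```python
MIB = 1024 * 1024

rows = 1_823_230        # Row objects in the ArrayList, per the heap dump
retained = 441 * MIB    # retained heap of that list
heap = 4 * 1024 * MIB   # -Xmx4G

bytes_per_row = retained / rows
fraction_of_heap = retained / heap

print(f"~{bytes_per_row:.0f} bytes per Row, {fraction_of_heap:.1%} of the heap")
```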
Regards,
Rick (R.) Gunderson
Software Engineer
IBM Commerce, B2B Development - GDHA
Phone: 1-250-220-1053
E-mail: rgunderson@ca.ibm.com
1803 Douglas St
Victoria, BC V8T 5C3
Canada
Re: tombstone_failure_threshold being ignored?
Posted by Rick Gunderson <rg...@ca.ibm.com>.
I would have thought that a RangeSliceReply (which is the parent object
that seems to "own" the ArrayList) would have contained only those objects
related to the corresponding query. The hierarchy of objects appears to
be:
org.apache.cassandra.net.OutboundTcpConnection$QueuedMessage
  org.apache.cassandra.net.MessageOut
    org.apache.cassandra.db.RangeSliceReply
      java.util.ArrayList
        java.lang.Object[1823230]
          org.apache.cassandra.db.Row
          ...
So to me it looks like all the Row objects are related to one outbound
message (assuming my interpretation of the heap dump is correct).
And regarding the tombstone_warn_threshold (left at the default of 1000),
we never see those warnings in the logs either (apart from the <file> and
<fileNamePattern> settings, we are using the out-of-the-box logback.xml).
Oleksandr Petrov <ol...@gmail.com> wrote on 05/03/2016 01:21 AM:
>
> If I understand the problem correctly, tombstone_failure_threshold is
> never reached because the ~2M objects might have been collected for
> different queries running in parallel, not for one query. No single
> query ever reached the threshold, although together they contributed
> to the OOM.
>
> You can read a bit more about the anti-patterns (particularly the ones
> related to workloads that generate lots of tombstones):
> http://www.datastax.com/dev/blog/cassandra-anti-patterns-queues-and-queue-like-datasets
>
> You can also try running repair and compaction more frequently,
> although I'd first look more closely at the read queries, possibly
> with tracing on, and check their parallelism. You could also lower
> the warn level for the tombstone thresholds to see where the bounds
> are.
>
Re: tombstone_failure_threshold being ignored?
Posted by Oleksandr Petrov <ol...@gmail.com>.
If I understand the problem correctly, tombstone_failure_threshold is never
reached because the ~2M objects might have been collected for different
queries running in parallel, not for one query. No single query ever
reached the threshold, although together they contributed to the OOM.
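A toy illustration of that effect, with made-up numbers (this is not Cassandra code, just the shape of the argument): each read stays under the failure threshold, so nothing throws, yet the replies held in memory at the same time still swamp a 4GB heap:

```python
FAILURE_THRESHOLD = 25_000   # tombstone_failure_threshold from the original mail
BYTES_PER_ROW = 254          # rough per-Row retained size (441 MiB / ~1.8M rows)
HEAP_BYTES = 4 * 1024**3     # -Xmx4G

# Hypothetical load: 500 range reads in flight, each hitting 20k tombstoned rows.
concurrent_reads = 500
tombstones_per_read = 20_000

# Per-read check: no single read ever crosses the failure threshold...
assert tombstones_per_read < FAILURE_THRESHOLD

# ...but the replies retained concurrently add up to most of the heap.
total_bytes = concurrent_reads * tombstones_per_read * BYTES_PER_ROW
print(f"{total_bytes / HEAP_BYTES:.0%} of a 4GB heap")
```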
You can read a bit more about the anti-patterns (particularly the ones
related to workloads that generate lots of tombstones):
http://www.datastax.com/dev/blog/cassandra-anti-patterns-queues-and-queue-like-datasets
You can also try running repair and compaction more frequently, although
I'd first look more closely at the read queries, possibly with tracing on,
and check their parallelism. You could also lower the warn level for the
tombstone thresholds to see where the bounds are.
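To be clear about why nothing is logged or thrown: the warn and failure counters apply per read, roughly like this simplified model (illustrative only, not Cassandra's actual implementation):

```python
class TombstoneOverwhelmingException(Exception):
    pass

def scan(tombstones_seen, warn_threshold=1_000, failure_threshold=25_000):
    """Simplified per-read tombstone accounting (illustrative only)."""
    count = 0
    for _ in range(tombstones_seen):
        count += 1
        if count > failure_threshold:
            # One read crossed the failure threshold: abort it.
            raise TombstoneOverwhelmingException(count)
    warnings = []
    if count > warn_threshold:
        warnings.append(f"Read {count} tombstone cells")
    return warnings

# Each individual read stays below both thresholds: no warning, no
# exception, even though the reads together touched 1.8M tombstones.
total = 0
for _ in range(2000):
    assert scan(900) == []   # 900 < warn_threshold, so nothing is logged
    total += 900
print(total)
```

If the tombstoned rows really are spread across many reads that each stay under the thresholds, neither the warning nor the exception will ever appear, which would match what you see in the logs.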
--
Alex