Posted to user@cassandra.apache.org by Rick Gunderson <rg...@ca.ibm.com> on 2016/04/28 19:22:59 UTC
tombstone_failure_threshold being ignored?
We are running Cassandra 2.2.3, 2 data centers, 3 nodes in each. The
replication factor per datacenter is 3. The Xmx setting on the Cassandra
JVMs is 4GB.
We have a workload that generates lots of tombstones, and Cassandra goes
OOM in about 24 hours. We've lowered the tombstone_failure_threshold to
25000, but we never see the TombstoneOverwhelmingException before the
nodes start going OOM.
The table operation that looks to be the culprit is a scan of partition
keys (i.e. we are scanning across narrow rows, not scanning within a wide
row). The heap dump shows a RangeSliceReply containing an ArrayList
with 1,823,230 org.apache.cassandra.db.Row objects with a retained heap
size of 441MiB. A look inside one of the Row objects shows an
org.apache.cassandra.db.DeletionInfo object, so I assume that means the
row has been tombstoned.
If all of the 1,823,230 Row objects are tombstoned (and it is likely that
most of them are), is there a reason that the
TombstoneOverwhelmingException never gets thrown?
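As a rough back-of-envelope check on the heap dump numbers (my arithmetic, so treat it as approximate), a single reply of this size already accounts for about a tenth of the heap:

```python
MIB = 1024 * 1024

rows = 1_823_230        # Row objects in the ArrayList, per the heap dump
retained = 441 * MIB    # retained heap of that list
heap = 4 * 1024 * MIB   # -Xmx4G

bytes_per_row = retained / rows
fraction_of_heap = retained / heap

print(f"~{bytes_per_row:.0f} bytes per Row, {fraction_of_heap:.1%} of the heap")
```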
Regards,
Rick (R.) Gunderson
Software Engineer
IBM Commerce, B2B Development - GDHA
Phone: 1-250-220-1053
E-mail: rgunderson@ca.ibm.com
1803 Douglas St
Victoria, BC V8T 5C3
Canada
Re: tombstone_failure_threshold being ignored?
Posted by Rick Gunderson <rg...@ca.ibm.com>.
I would have thought that a RangeSliceReply (which is the parent object
that seems to "own" the ArrayList) would have contained only those objects
related to the corresponding query. The hierarchy of objects appears to
be:
org.apache.cassandra.net.OutboundTcpConnection$QueuedMessage
  org.apache.cassandra.net.MessageOut
    org.apache.cassandra.db.RangeSliceReply
      java.util.ArrayList
        java.lang.Object[1823230]
          org.apache.cassandra.db.Row
          ...
So to me it looks like all the Row objects are related to one outbound
message (assuming my interpretation of the heap dump is correct).
And regarding the tombstone_warn_threshold (left at the default of 1000),
we never see those warnings in the logs either (apart from the <file> and
<fileNamePattern> settings, we are using the out-of-the-box logback.xml).
Oleksandr Petrov <ol...@gmail.com> wrote on 05/03/2016 01:21 AM:
>
> If I understand the problem correctly, tombstone_failure_threshold is
> never reached because the ~2M objects might have been collected for
> different queries running in parallel, not for one query. No single
> query ever reached the threshold, although together they contributed
> to the OOM.
>
> You can read a bit more about the anti-patterns (particularly the ones
> related to workloads that generate lots of tombstones):
> http://www.datastax.com/dev/blog/cassandra-anti-patterns-queues-and-queue-like-datasets
>
> You can also try running repair and compaction more frequently,
> although I'd first look more closely at the read queries, possibly
> with tracing on, and check their parallelism. You could also lower
> the warn level for the tombstone thresholds to see where the bounds
> are.
>
Re: tombstone_failure_threshold being ignored?
Posted by Oleksandr Petrov <ol...@gmail.com>.
If I understand the problem correctly, tombstone_failure_threshold is never
reached because the ~2M objects might have been collected for different
queries running in parallel, not for one query. No single query ever
reached the threshold, although together they contributed to the OOM.
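A toy illustration of that effect, with made-up numbers (this is not Cassandra code, just the shape of the argument): each read stays under the failure threshold, so nothing throws, yet the replies held in memory at the same time still swamp a 4GB heap:

```python
FAILURE_THRESHOLD = 25_000   # tombstone_failure_threshold from the original mail
BYTES_PER_ROW = 254          # rough per-Row retained size (441 MiB / ~1.8M rows)
HEAP_BYTES = 4 * 1024**3     # -Xmx4G

# Hypothetical load: 500 range reads in flight, each hitting 20k tombstoned rows.
concurrent_reads = 500
tombstones_per_read = 20_000

# Per-read check: no single read ever crosses the failure threshold...
assert tombstones_per_read < FAILURE_THRESHOLD

# ...but the replies retained concurrently add up to most of the heap.
total_bytes = concurrent_reads * tombstones_per_read * BYTES_PER_ROW
print(f"{total_bytes / HEAP_BYTES:.0%} of a 4GB heap")
```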
You can read a bit more about the anti-patterns (particularly the ones
related to workloads that generate lots of tombstones):
http://www.datastax.com/dev/blog/cassandra-anti-patterns-queues-and-queue-like-datasets
You can also try running repair and compaction more frequently, although
I'd first look more closely at the read queries, possibly with tracing on,
and check their parallelism. You could also lower the warn level for the
tombstone thresholds to see where the bounds are.
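To be clear about why nothing is logged or thrown: the warn and failure counters apply per read, roughly like this simplified model (illustrative only, not Cassandra's actual implementation):

```python
class TombstoneOverwhelmingException(Exception):
    pass

def scan(tombstones_seen, warn_threshold=1_000, failure_threshold=25_000):
    """Simplified per-read tombstone accounting (illustrative only)."""
    count = 0
    for _ in range(tombstones_seen):
        count += 1
        if count > failure_threshold:
            # One read crossed the failure threshold: abort it.
            raise TombstoneOverwhelmingException(count)
    warnings = []
    if count > warn_threshold:
        warnings.append(f"Read {count} tombstone cells")
    return warnings

# Each individual read stays below both thresholds: no warning, no
# exception, even though the reads together touched 1.8M tombstones.
total = 0
for _ in range(2000):
    assert scan(900) == []   # 900 < warn_threshold, so nothing is logged
    total += 900
print(total)
```

If the tombstoned rows really are spread across many reads that each stay under the thresholds, neither the warning nor the exception will ever appear, which would match what you see in the logs.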
--
Alex