Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2015/02/26 20:09:04 UTC

[jira] [Resolved] (CASSANDRA-8870) Tombstone overwhelming issue aborts client queries

     [ https://issues.apache.org/jira/browse/CASSANDRA-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-8870.
---------------------------------------
    Resolution: Not a Problem

This is by design and means you have a data model that will fall over under its own weight soon.  Cassandra is protecting itself from being taken down with it.

You can adjust the thresholds in cassandra.yaml if you need some breathing room to fix your application.

{noformat}
# When executing a scan, within or across a partition, we need to keep the
# tombstones seen in memory so we can return them to the coordinator, which
# will use them to make sure other replicas also know about the deleted rows.
# With workloads that generate a lot of tombstones, this can cause performance
# problems and even exhaust the server heap.
# (http://www.datastax.com/dev/blog/cassandra-anti-patterns-queues-and-queue-like-datasets)
# Adjust the thresholds here if you understand the dangers and want to
# scan more tombstones anyway.  These thresholds may also be adjusted at runtime
# using the StorageService mbean.
tombstone_warn_threshold: 1000
tombstone_failure_threshold: 100000
{noformat}
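
To make the two thresholds concrete, here is a minimal Python sketch of the behavior the comment above describes. It is a toy model, not Cassandra's implementation: the scan function, the (value, is_tombstone) cell representation, and TombstoneOverwhelmingError are hypothetical names chosen for illustration. Only the two default values come from cassandra.yaml, and the assumption that a scan aborts once the count exceeds the failure threshold is inferred from the "Scanned over 100000 tombstones" log message.

```python
import logging

logger = logging.getLogger("tombstone_sketch")

# Defaults copied from the cassandra.yaml excerpt above.
TOMBSTONE_WARN_THRESHOLD = 1000
TOMBSTONE_FAILURE_THRESHOLD = 100000


class TombstoneOverwhelmingError(Exception):
    """Hypothetical stand-in for Cassandra's TombstoneOverwhelmingException."""


def scan(cells,
         warn_threshold=TOMBSTONE_WARN_THRESHOLD,
         failure_threshold=TOMBSTONE_FAILURE_THRESHOLD):
    """Scan (value, is_tombstone) pairs and return (live_values, tombstones_seen).

    Assumed behavior: abort as soon as the tombstone count exceeds
    failure_threshold; log a warning if it ends above warn_threshold.
    """
    live = []
    tombstones = 0
    for value, is_tombstone in cells:
        if is_tombstone:
            tombstones += 1
            if tombstones > failure_threshold:
                raise TombstoneOverwhelmingError(
                    "Scanned over %d tombstones; query aborted" % failure_threshold
                )
        else:
            live.append(value)
    if tombstones > warn_threshold:
        logger.warning("Read %d live and %d tombstoned cells", len(live), tombstones)
    return live, tombstones


if __name__ == "__main__":
    # A queue-like partition: many deleted entries in front of a few live ones.
    queue_like = [(None, True)] * 20 + [("job-21", False)]
    print(scan(queue_like, warn_threshold=5, failure_threshold=50))
    try:
        scan(queue_like, warn_threshold=5, failure_threshold=10)
    except TombstoneOverwhelmingError as exc:
        print("aborted:", exc)
```

In this model, raising the thresholds only postpones the abort. The scan still walks every tombstone in front of the live data, which is why the fix belongs in the data model.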

> Tombstone overwhelming issue aborts client queries
> --------------------------------------------------
>
>                 Key: CASSANDRA-8870
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8870
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Cassandra 2.1.2, Ubuntu 12.04
>            Reporter: Jeff Liu
>
> Clients querying our Cassandra cluster are hitting query timeouts.
> Nodetool status nevertheless shows all nodes as up.
> Logs from client side:
> {noformat}
> com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: cass-chisel01.tgr01.iad02.testd.nestlabs.com/10.66.182.113:9042 (com.datastax.driver.core.TransportException: [cass-chisel01.tgr01.iad02.testd.nestlabs.com/10.66.182.113:9042] Connection has been closed))
>         at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:108) ~[com.datastax.cassandra.cassandra-driver-core-2.1.3.jar:na]
>         at com.datastax.driver.core.RequestHandler$1.run(RequestHandler.java:179) ~[com.datastax.cassandra.cassandra-driver-core-2.1.3.jar:na]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_55]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.7.0_55]
>         at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_55]
> {noformat}
> Logs from cassandra/system.log
> {noformat}
> ERROR [HintedHandoff:2] 2015-02-23 23:46:28,410 SliceQueryFilter.java:212 - Scanned over 100000 tombstones in system.hints; query aborted (see tombstone_failure_threshold)
> ERROR [HintedHandoff:2] 2015-02-23 23:46:28,417 CassandraDaemon.java:153 - Exception in thread Thread[HintedHandoff:2,1,main]
> org.apache.cassandra.db.filter.TombstoneOverwhelmingException: null
>         at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:214) ~[apache-cassandra-2.1.2.jar:2.1.2]
>         at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:107) ~[apache-cassandra-2.1.2.jar:2.1.2]
>         at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:81) ~[apache-cassandra-2.1.2.jar:2.1.2]
>         at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:69) ~[apache-cassandra-2.1.2.jar:2.1.2]
>         at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:310) ~[apache-cassandra-2.1.2.jar:2.1.2]
>         at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:60) ~[apache-cassandra-2.1.2.jar:2.1.2]
>         at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1858) ~[apache-cassandra-2.1.2.jar:2.1.2]
>         at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1666) ~[apache-cassandra-2.1.2.jar:2.1.2]
>         at org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:385) ~[apache-cassandra-2.1.2.jar:2.1.2]
>         at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:344) ~[apache-cassandra-2.1.2.jar:2.1.2]
>         at org.apache.cassandra.db.HintedHandOffManager.access$400(HintedHandOffManager.java:94) ~[apache-cassandra-2.1.2.jar:2.1.2]
>         at org.apache.cassandra.db.HintedHandOffManager$5.run(HintedHandOffManager.java:555) ~[apache-cassandra-2.1.2.jar:2.1.2]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_55]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.7.0_55]
>         at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_55]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)