Posted to commits@cassandra.apache.org by "Dominic Letz (JIRA)" <ji...@apache.org> on 2014/12/30 03:04:13 UTC
[jira] [Created] (CASSANDRA-8547) Make RangeTombstone.Tracker.isDeleted() faster
Dominic Letz created CASSANDRA-8547:
---------------------------------------
Summary: Make RangeTombstone.Tracker.isDeleted() faster
Key: CASSANDRA-8547
URL: https://issues.apache.org/jira/browse/CASSANDRA-8547
Project: Cassandra
Issue Type: Improvement
Components: Core
Environment: 2.0.11
Reporter: Dominic Letz
Attachments: rangetombstone.tracker.txt
During compactions and repairs with many tombstones, an exorbitant amount of time is spent in RangeTombstone.Tracker.isDeleted().
The time spent there can be so large that compactions and repairs look "stalled", with the estimated time remaining frozen at the same value for days.
I sample-profiled the code with VisualVM during execution and found this both during compactions and during repairs. (Point-in-time backtraces are attached.)
Looking at the code, the problem is obviously the linear scan over all range tombstones for every column:
{code}
public boolean isDeleted(Column column)
{
    for (RangeTombstone tombstone : ranges)
    {
        if (comparator.compare(column.name(), tombstone.min) >= 0
            && comparator.compare(column.name(), tombstone.max) <= 0
            && tombstone.maxTimestamp() >= column.timestamp())
        {
            return true;
        }
    }
    return false;
}
{code}
I would like to propose replacing this linear scan with a lookup in a sorted structure (e.g. RangeTombstoneList).
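To illustrate the idea, here is a minimal sketch of a sorted lookup. It is not the actual RangeTombstoneList code: the class SortedTombstoneIndex and its methods are hypothetical, column names are plain Strings rather than ByteBuffers with a comparator, and it assumes the stored ranges never overlap. Under that assumption a single floor lookup on the range start finds the only candidate range in O(log n) instead of O(n).

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of a sorted tombstone lookup; not Cassandra code. */
final class SortedTombstoneIndex {

    /** An inclusive deletion range [min, max] with its deletion timestamp. */
    static final class Range {
        final String min;
        final String max;
        final long timestamp;

        Range(String min, String max, long timestamp) {
            this.min = min;
            this.max = max;
            this.timestamp = timestamp;
        }
    }

    // Ranges keyed by their start. ASSUMPTION: ranges never overlap, so at
    // most one range can cover any given column name.
    private final TreeMap<String, Range> byStart = new TreeMap<>();

    void add(String min, String max, long timestamp) {
        byStart.put(min, new Range(min, max, timestamp));
    }

    /** Same predicate as the linear scan, evaluated on one candidate range. */
    boolean isDeleted(String name, long timestamp) {
        // Greatest range start that is <= name, found in O(log n).
        Map.Entry<String, Range> candidate = byStart.floorEntry(name);
        if (candidate == null) {
            return false; // name sorts before every range
        }
        Range r = candidate.getValue();
        return name.compareTo(r.max) <= 0 && r.timestamp >= timestamp;
    }
}
```

The linear scan above also handles overlapping ranges, so a real replacement would have to merge or split overlapping tombstones on insertion before a single-candidate lookup like this is valid.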
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)