Posted to commits@cassandra.apache.org by "Corentin Chary (JIRA)" <ji...@apache.org> on 2016/12/14 09:20:58 UTC
[jira] [Comment Edited] (CASSANDRA-13038) 33% of compaction time spent in StreamingHistogram.update()

    [ https://issues.apache.org/jira/browse/CASSANDRA-13038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15747730#comment-15747730 ] 

Corentin Chary edited comment on CASSANDRA-13038 at 12/14/16 9:20 AM:
----------------------------------------------------------------------

Looking at the compaction code, this would probably not work for strategies other than TWCS or DTCS, because they rely on worthDroppingTombstones(). For TWCS and DTCS, it means that only fully expired SSTables would be deleted (which is probably a good thing, but is a pretty big behavior change).

Any opinions?


was (Author: iksaif):
Looking at the compaction code, it's unlikely to work because all compaction strategies seem to use worthDroppingTombstones() and never look at minTTL and maxTTL. This means that the deletion times of cells with TTLs need to end up in this histogram (or that worthDroppingTombstones() should also take TTLs into account using the min and max TTL).

Any opinions?

> 33% of compaction time spent in StreamingHistogram.update()
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-13038
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13038
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Compaction
>            Reporter: Corentin Chary
>         Attachments: compaction-streaminghistrogram.png, profiler-snapshot.nps, tombstone-histograms-expiring.patch
>
>
> With the following table, that contains a *lot* of cells: 
> {code}
> CREATE TABLE biggraphite.datapoints_11520p_60s (
>     metric uuid,
>     time_start_ms bigint,
>     offset smallint,
>     count int,
>     value double,
>     PRIMARY KEY ((metric, time_start_ms), offset)
> ) WITH CLUSTERING ORDER BY (offset DESC)
>     AND compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy', 'compaction_window_size': '6', 'compaction_window_unit': 'HOURS', 'max_threshold': '32', 'min_threshold': '6'};
> Keyspace : biggraphite
>         Read Count: 1822
>         Read Latency: 1.8870054884742042 ms.
>         Write Count: 2212271647
>         Write Latency: 0.027705127678653473 ms.
>         Pending Flushes: 0
>                 Table: datapoints_11520p_60s
>                 SSTable count: 47
>                 Space used (live): 300417555945
>                 Space used (total): 303147395017
>                 Space used by snapshots (total): 0
>                 Off heap memory used (total): 207453042
>                 SSTable Compression Ratio: 0.4955200053039823
>                 Number of keys (estimate): 16343723
>                 Memtable cell count: 220576
>                 Memtable data size: 17115128
>                 Memtable off heap memory used: 0
>                 Memtable switch count: 2872
>                 Local read count: 0
>                 Local read latency: NaN ms
>                 Local write count: 1103167888
>                 Local write latency: 0.025 ms
>                 Pending flushes: 0
>                 Percent repaired: 0.0
>                 Bloom filter false positives: 0
>                 Bloom filter false ratio: 0.00000
>                 Bloom filter space used: 105118296
>                 Bloom filter off heap memory used: 106547192
>                 Index summary off heap memory used: 27730962
>                 Compression metadata off heap memory used: 73174888
>                 Compacted partition minimum bytes: 61
>                 Compacted partition maximum bytes: 51012
>                 Compacted partition mean bytes: 7899
>                 Average live cells per slice (last five minutes): NaN
>                 Maximum live cells per slice (last five minutes): 0
>                 Average tombstones per slice (last five minutes): NaN
>                 Maximum tombstones per slice (last five minutes): 0
>                 Dropped Mutations: 0
> {code}
> It looks like a good chunk of the compaction time is spent in StreamingHistogram.update() (which is used to store the estimated tombstone drop times).
> This could be caused by a huge number of different deletion times, which would make the bins huge, but this histogram should be capped at 100 keys. It's more likely caused by the huge number of cells.
> A simple solution could be to take only part of the cells into account. The fact that this table uses TWCS also gives us an additional hint that sampling deletion times would be fine.
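
To make the sampling idea in the description above concrete, here is a rough sketch. It is a simplified stand-in and not Cassandra's StreamingHistogram: the class name SampledDropTimeHistogram, the sampleEvery parameter, the deterministic 1-in-N sampling scheme, and the linear-scan bin merge are all assumptions made for illustration. Only the cap of 100 keys comes from the description.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, NOT Cassandra's StreamingHistogram.
// It keeps a capped number of (deletionTime -> count) bins and only
// records one out of every `sampleEvery` cells.
class SampledDropTimeHistogram {
    private final int maxBins;      // cap on distinct keys (100 in the issue)
    private final int sampleEvery;  // assumed knob: record 1 cell out of N
    private final TreeMap<Long, Long> bins = new TreeMap<>();
    private long seen = 0;

    SampledDropTimeHistogram(int maxBins, int sampleEvery) {
        if (maxBins < 1 || sampleEvery < 1)
            throw new IllegalArgumentException("maxBins and sampleEvery must be >= 1");
        this.maxBins = maxBins;
        this.sampleEvery = sampleEvery;
    }

    void update(long deletionTime) {
        // Skipped cells never touch the map, which is where the time goes.
        if (seen++ % sampleEvery != 0)
            return;
        // Each sampled cell stands in for `sampleEvery` cells.
        bins.merge(deletionTime, (long) sampleEvery, Long::sum);
        if (bins.size() > maxBins)
            mergeClosestBins();
    }

    // Merge the two adjacent bins with the smallest gap into one bin
    // located at their count-weighted average.
    private void mergeClosestBins() {
        Long prev = null;
        long left = 0, right = 0, bestGap = Long.MAX_VALUE;
        for (Long key : bins.keySet()) {
            if (prev != null && key - prev < bestGap) {
                bestGap = key - prev;
                left = prev;
                right = key;
            }
            prev = key;
        }
        long c1 = bins.remove(left);
        long c2 = bins.remove(right);
        long merged = left + (right - left) * c2 / (c1 + c2);
        bins.merge(merged, c1 + c2, Long::sum);
    }

    int binCount() {
        return bins.size();
    }

    // Estimated number of cells seen, scaled back up from the sample.
    long estimatedTotal() {
        long total = 0;
        for (Map.Entry<Long, Long> e : bins.entrySet())
            total += e.getValue();
        return total;
    }
}
```

With sampleEvery set to 10, only a tenth of the cells reach the map, at the cost of a coarser estimate. Whether that trade-off is acceptable for strategies other than TWCS is exactly the open question in this ticket.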



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)