Posted to commits@cassandra.apache.org by "T Jake Luciani (JIRA)" <ji...@apache.org> on 2015/05/12 22:54:00 UTC
[jira] [Commented] (CASSANDRA-9365) Prioritize compactions based on read activity
[ https://issues.apache.org/jira/browse/CASSANDRA-9365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14540705#comment-14540705 ]
T Jake Luciani commented on CASSANDRA-9365:
-------------------------------------------
[~jbellis] that's within a table. I'm talking about across tables.
Thought experiment: you have 10 tables and 4 compaction slots. You write evenly to all tables, but one of them gets 80% of the reads. You want the table serving those reads to get the most compaction time.
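To put rough numbers on the thought experiment, here is a minimal sketch of one way to split compaction time by read share. It is an illustration only, not how Cassandra schedules compactions: the class `ReadWeightedShares`, the method `shares`, the `fairnessFloor` parameter, and the read counts are all hypothetical. A fixed fraction of the time is divided evenly so that no table is starved, and the remainder follows the reads.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of Cassandra. */
final class ReadWeightedShares {

    /**
     * Splits compaction time across tables. The fraction fairnessFloor is
     * divided evenly among all tables; the rest is divided in proportion to
     * each table's share of reads. The returned shares sum to 1.
     */
    static Map<String, Double> shares(Map<String, Long> readsPerTable, double fairnessFloor) {
        if (readsPerTable.isEmpty() || fairnessFloor < 0.0 || fairnessFloor > 1.0) {
            throw new IllegalArgumentException("need at least one table and a floor in [0, 1]");
        }
        long totalReads = 0;
        for (long reads : readsPerTable.values()) {
            totalReads += reads;
        }
        double evenPart = 1.0 / readsPerTable.size();
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Long> entry : readsPerTable.entrySet()) {
            // With no reads recorded at all, fall back to an even split.
            double readPart = totalReads == 0 ? evenPart : (double) entry.getValue() / totalReads;
            result.put(entry.getKey(), fairnessFloor * evenPart + (1.0 - fairnessFloor) * readPart);
        }
        return result;
    }

    public static void main(String[] args) {
        // The thought experiment: 10 tables, one of them receiving 80% of the reads.
        Map<String, Long> reads = new LinkedHashMap<>();
        reads.put("hot", 720L);
        for (int i = 1; i <= 9; i++) {
            reads.put("cold" + i, 20L);
        }
        int slots = 4; // compaction slots, as in the thought experiment
        shares(reads, 0.2).forEach((table, share) ->
                System.out.printf("%s: %.2f slot-equivalents%n", table, share * slots));
    }
}
```

With an assumed floor of 0.2, the hot table ends up with 0.2 * 0.1 + 0.8 * 0.8 = 0.66 of the compaction time, and each of the other nine tables gets a little under 0.04.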
> Prioritize compactions based on read activity
> ---------------------------------------------
>
> Key: CASSANDRA-9365
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9365
> Project: Cassandra
> Issue Type: Improvement
> Reporter: T Jake Luciani
> Fix For: 3.x
>
>
> The main purpose of compaction is to keep reads fast by consolidating SSTables so that fewer of them have to be merged on read.
> In a cluster with many tables we currently treat all pending compactions as equal, when in reality we may be reading mainly from one of the tables.
> Rather than FIFO, we should prioritize access to the compactors based on read activity. SSTables per read might be a good metric. We would also need to be sure to stay fair to other tables over time. This would be a way to skew the work towards the tables that need compaction the most.
> It might also be nice to offer a nodetool command to kill specific in-progress compaction jobs that are not important under load.
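The prioritization described in the issue could be sketched roughly as follows, assuming an SSTables-per-read figure is available for each table. Everything here is hypothetical (the class `ReadAwareCompactionQueue`, its `Pending` type, and the `agingBonusPerSecond` knob); it is not Cassandra's compaction manager. The score is the SSTables-per-read value plus a bonus that grows with waiting time, which is one simple way to stay fair to the other tables over time.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names exist in Cassandra. Not thread-safe. */
final class ReadAwareCompactionQueue {

    /** A pending compaction for one table. */
    static final class Pending {
        final String table;
        final double sstablesPerRead;  // assumed to come from the table's read metrics
        final long enqueuedAtMillis;

        Pending(String table, double sstablesPerRead, long enqueuedAtMillis) {
            this.table = table;
            this.sstablesPerRead = sstablesPerRead;
            this.enqueuedAtMillis = enqueuedAtMillis;
        }
    }

    private final List<Pending> pending = new ArrayList<>();
    private final double agingBonusPerSecond;

    ReadAwareCompactionQueue(double agingBonusPerSecond) {
        this.agingBonusPerSecond = agingBonusPerSecond;
    }

    void submit(Pending task) {
        pending.add(task);
    }

    /** Score rises with SSTables per read and with the time the task has waited. */
    double score(Pending task, long nowMillis) {
        double waitedSeconds = Math.max(0, nowMillis - task.enqueuedAtMillis) / 1000.0;
        return task.sstablesPerRead + agingBonusPerSecond * waitedSeconds;
    }

    /** Removes and returns the highest-scoring task, or null if nothing is pending. */
    Pending pollNext(long nowMillis) {
        Pending best = null;
        for (Pending candidate : pending) {
            // Strict comparison keeps submission (FIFO) order among equal scores.
            if (best == null || score(candidate, nowMillis) > score(best, nowMillis)) {
                best = candidate;
            }
        }
        if (best != null) {
            pending.remove(best);
        }
        return best;
    }
}
```

A free compaction slot would call `pollNext(System.currentTimeMillis())`. With an aging bonus of 0.01 per second, a task at 1 SSTable per read overtakes a freshly submitted task at 6 SSTables per read after waiting a little over 500 seconds; setting the bonus to 0 gives pure read-based priority with no fairness guarantee.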
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)