Posted to commits@cassandra.apache.org by "graham sanderson (JIRA)" <ji...@apache.org> on 2014/08/06 10:25:12 UTC

[jira] [Commented] (CASSANDRA-6945) Calculate liveRatio on per-memtable basis, non per-CF

    [ https://issues.apache.org/jira/browse/CASSANDRA-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14087410#comment-14087410 ] 

graham sanderson commented on CASSANDRA-6945:
---------------------------------------------

I made some comments on CASSANDRA-6944 that relate to this change (they perhaps belong here).

> Calculate liveRatio on per-memtable basis, non per-CF
> -----------------------------------------------------
>
>                 Key: CASSANDRA-6945
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6945
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Aleksey Yeschenko
>            Assignee: Aleksey Yeschenko
>             Fix For: 2.0.7
>
>
> Currently we recalculate live ratio on every doubling of write ops to the CF, not to an individual memtable. The value itself is also CF-bound, not memtable-bound. This causes several issues:
> 1. Depending on what stage the current memtable is at, the calculated live ratio can vary *a lot*
> 2. The calculated live ratio can potentially stay that way for quite a while - the longer the C* process has been running, the longer the value stays incorrect
> 3. Incorrect live ratio means inefficient MeteredFlusher - flushing less or more often than needed, picking bad candidates for flushing, etc.
> 4. Incorrect live ratio means incorrect size returned to the metrics consumers
> 5. Compaction strategies that rely on memtable size estimation are affected
> 6. All of the above is slightly amplified by the fact that all the memtables pending flush would also use that one incorrect value
> Depending on the stage the current memtable is at when live ratio is recalculated, the calculated value can be *extremely* wrong (say, a recently created, fresh memtable would have a much higher than average live ratio).
> The suggested fix is to bind live ratio to individual memtables, not to column families as a whole, with some optimizations to make recalculations run less often by inheriting the previous memtable's stats.
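
The suggested fix can be illustrated with a rough sketch. This is not Cassandra's code (which is Java) but a toy Python model, and every name in it (`Memtable`, `toy_heap_bytes`, the default ratio of 10.0, the first recalculation at 4 ops, the fixed 1024-byte overhead) is an assumption made for illustration. It shows a live ratio owned by each memtable, recalculated on each doubling of that memtable's own write ops, and seeded from the previous memtable's value.

```python
def toy_heap_bytes(memtable):
    """Stand-in for a deep heap measurement (hypothetical cost model):
    a fixed overhead plus twice the serialized size."""
    return 1024 + 2 * memtable.serialized_bytes


class Memtable:
    DEFAULT_LIVE_RATIO = 10.0  # assumed starting guess, not a Cassandra constant
    FIRST_RECALC_OPS = 4       # assumed first recalculation threshold

    def __init__(self, measure_heap_bytes, previous=None):
        self._measure = measure_heap_bytes
        self.ops = 0
        self.serialized_bytes = 0
        self.recalculations = 0
        # Inherit the previous memtable's ratio as the initial estimate,
        # so a fresh memtable does not start from an arbitrary value.
        if previous is not None:
            self.live_ratio = previous.live_ratio
        else:
            self.live_ratio = self.DEFAULT_LIVE_RATIO
        self._next_recalc_ops = self.FIRST_RECALC_OPS

    def write(self, serialized_size):
        self.ops += 1
        self.serialized_bytes += serialized_size
        # Recalculate on each doubling of write ops to THIS memtable,
        # not to the column family as a whole.
        if self.ops >= self._next_recalc_ops:
            self._recalculate()
            self._next_recalc_ops *= 2

    def _recalculate(self):
        if self.serialized_bytes > 0:
            self.live_ratio = self._measure(self) / self.serialized_bytes
            self.recalculations += 1

    def estimated_live_bytes(self):
        # Size estimate that a flusher or a metrics consumer would read.
        return self.live_ratio * self.serialized_bytes


old = Memtable(toy_heap_bytes)
for _ in range(16):
    old.write(100)

fresh = Memtable(toy_heap_bytes, previous=old)  # starts from old's ratio
for _ in range(4):
    fresh.write(100)
```

Under this toy cost model the fresh memtable ends up with a higher ratio than the older, fuller one, and recalculating it leaves the older memtable's value untouched, which is the property the per-CF value lacks.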


