Posted to commits@cassandra.apache.org by "Brandon Williams (JIRA)" <ji...@apache.org> on 2012/04/25 21:24:19 UTC

[jira] [Commented] (CASSANDRA-4189) Improve hints replay

    [ https://issues.apache.org/jira/browse/CASSANDRA-4189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261975#comment-13261975 ] 

Brandon Williams commented on CASSANDRA-4189:
---------------------------------------------

bq. It might be worth sharding the hints based on the hour at which they are stored. This can reduce the complexity of scanning for hints.

It's not clear to me that this is worth the extra code complexity (and IO). Also, on (mis)completion we flush and force a compaction, which should clear out the tombstones (see CASSANDRA-3733), so I'm skeptical this is a real problem.
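
To make the tradeoff concrete, here is a minimal sketch of what hour-based sharding could look like. It is written in Python for brevity and is not Cassandra's schema or code: the `target:bucket` key format and the names `hint_row_key` and `buckets_to_scan` are assumptions made up for illustration. It shows where the extra IO comes from: each hour bucket becomes a separate row that replay has to read.

```python
from typing import List

HOUR_SECONDS = 3600  # one bucket per hour, as the proposal suggests


def hint_row_key(target: str, stored_at: float) -> str:
    """Hypothetical row key: target endpoint plus the hour the hint was stored."""
    bucket = int(stored_at // HOUR_SECONDS)
    return f"{target}:{bucket}"


def buckets_to_scan(target: str, oldest: float, now: float) -> List[str]:
    """Row keys a replay would have to read to cover hints stored in [oldest, now]."""
    first = int(oldest // HOUR_SECONDS)
    last = int(now // HOUR_SECONDS)
    return [f"{target}:{b}" for b in range(first, last + 1)]
```

Under these assumptions a node that was down for three hours means four row reads on replay instead of one, in exchange for shorter rows with fewer tombstones each.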

bq. Problem: Hints replay is too slow and single threaded.

I disagree; historically our problem with hints has always been *overload*, which I think we finally got right in CASSANDRA-3554. Sure, in a two-node cluster the single-threaded nature may be a problem, but in any cluster of appreciable size overload is always the issue, so I don't see much to be gained by multithreading it.
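
For reference, the alternative proposed in the description (replay in parallel, throttled on bytes read) could be sketched roughly as follows. This is an illustration in Python, not Cassandra code: `ByteThrottle`, `replay_in_parallel`, the `send` callback, and the `hints_by_target` mapping are all hypothetical names. The point is that a single shared limiter caps total throughput no matter how many threads replay, which is what keeps parallelism from turning into overload.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class ByteThrottle:
    """Shared pacing limiter: each caller reserves a time slot sized to its bytes."""

    def __init__(self, bytes_per_sec, clock=time.monotonic, sleep=time.sleep):
        self.rate = float(bytes_per_sec)
        self.clock = clock
        self.sleep = sleep
        self.lock = threading.Lock()
        self.next_free = clock()  # earliest time the next read may start

    def acquire(self, nbytes):
        """Block until nbytes fit under the rate; return the seconds waited."""
        with self.lock:
            now = self.clock()
            start = max(now, self.next_free)
            self.next_free = start + nbytes / self.rate
            wait = start - now
        if wait > 0:
            self.sleep(wait)  # sleep outside the lock so other threads can reserve
        return wait


def replay_in_parallel(hints_by_target, send, throttle, workers=4):
    """Replay each target's hints on its own worker, all sharing one throttle."""

    def replay_one(target):
        sent = 0
        for hint in hints_by_target[target]:
            throttle.acquire(len(hint))
            send(target, hint)
            sent += 1
        return target, sent

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(replay_one, hints_by_target))
```

The clock and sleep functions are injectable only so the limiter can be exercised without real delays. Whether a limiter like this helps in practice depends on the receiving nodes, which is the overload concern raised above.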
                
> Improve hints replay
> --------------------
>
>                 Key: CASSANDRA-4189
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4189
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.2
>            Reporter: Vijay
>            Assignee: Vijay
>            Priority: Minor
>             Fix For: 1.2
>
>
> Problem: Hints are stored in one row.
> When a lot of hints are stored, we also store tombstones for the ones that have been replayed.
> It might be worth sharding the hints based on the hour at which they are stored. This can reduce the complexity of scanning for hints.
> Problem: Hints replay is too slow and single threaded.
> There are use cases where hints need to be replayed ASAP to make the cluster more consistent.
> In a multi-region cluster, replay is already throttled by network latency, which is on the order of hundreds of milliseconds.
> It might be worth trying to replay the hints in parallel and throttle on the number of bytes read from disk, or apply the existing sleep-interval throttle on all the threads.
