Posted to commits@cassandra.apache.org by "Jeff Jirsa (JIRA)" <ji...@apache.org> on 2016/09/19 04:57:21 UTC
[jira] [Commented] (CASSANDRA-11218) Prioritize Secondary Index rebuild

    [ https://issues.apache.org/jira/browse/CASSANDRA-11218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15502288#comment-15502288 ] 

Jeff Jirsa commented on CASSANDRA-11218:
----------------------------------------

I have a version of this patch I'll be submitting very soon, but while I wait for internal approvals, I'd like to describe the implementation so that those of you who care about this can provide feedback conceptually before I submit a patch for review.

I'm implementing this as a priority queue that uses a custom comparator implemented with three tiers:

* Operation type priority (to allow certain types - like index rebuild - to run at higher priorities, and others - scrub / cleanup / verify - to run at much lower priorities). This is defined as an int field in the OperationType enum and can be overridden via a system property. There is plenty of room for bikeshedding over the exact priorities - I've chosen (highest priority to lowest):

** Anticompaction
** Index Build / View Build
** Key Cache Save / Row Cache Save / Counter Cache Save
** User Defined Compaction
** Compaction (including maximal/major compaction)
** Tombstone Compaction
** Scrub / Cleanup / Upgrade SSTables
** Index Summary Redistribution
** Verify

* Sub type priority (to allow compaction tasks within a type to have preference - to enable behavior like CASSANDRA-6288). This is defined as a long and set by the compaction strategies. By default, I'm setting it to the bytes on disk of the source sstables, so larger transactions (measured at the time the task was created) are preferred over smaller ones.

* Timestamp priority, where tasks with the same type/subtype values are served FIFO.
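
As a rough illustration of the three tiers above (a sketch, not the patch itself), the ordering could be expressed as a single comparator handed to a PriorityBlockingQueue. Everything named here is an assumption made for the example: the OpType enum is a cut-down stand-in for OperationType, the numeric priorities are invented, and the "sketch.priority.<NAME>" system property is a hypothetical override mechanism.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.PriorityBlockingQueue;

class PrioritySketch {
    /** Illustrative subset of operation types; the numeric priorities are made up. */
    enum OpType {
        ANTICOMPACTION(900), INDEX_BUILD(800), COMPACTION(500), VERIFY(100);

        final int priority;

        OpType(int defaultPriority) {
            // Hypothetical property name, e.g. -Dsketch.priority.VERIFY=50
            this.priority = Integer.getInteger("sketch.priority." + name(), defaultPriority);
        }
    }

    static final class Task {
        final String label;
        final OpType type;
        final long subTypePriority; // e.g. bytes on disk of the source sstables
        final long createdAt;       // creation timestamp, used for FIFO tie-breaking

        Task(String label, OpType type, long subTypePriority, long createdAt) {
            this.label = label;
            this.type = type;
            this.subTypePriority = subTypePriority;
            this.createdAt = createdAt;
        }
    }

    /** Tier 1: higher type priority first. Tier 2: higher subtype first. Tier 3: older first. */
    static final Comparator<Task> ORDER = (a, b) -> {
        int c = Integer.compare(b.type.priority, a.type.priority);
        if (c == 0) c = Long.compare(b.subTypePriority, a.subTypePriority);
        if (c == 0) c = Long.compare(a.createdAt, b.createdAt);
        return c;
    };

    /** Queues a few tasks in arbitrary order and returns the order they come back out. */
    static List<String> drainOrder() {
        PriorityBlockingQueue<Task> queue = new PriorityBlockingQueue<>(11, ORDER);
        queue.add(new Task("verify", OpType.VERIFY, 0L, 1L));
        queue.add(new Task("small compaction", OpType.COMPACTION, 10L, 2L));
        queue.add(new Task("large compaction", OpType.COMPACTION, 1_000L, 3L));
        queue.add(new Task("index build A", OpType.INDEX_BUILD, 0L, 5L));
        queue.add(new Task("index build B", OpType.INDEX_BUILD, 0L, 4L));

        List<String> order = new ArrayList<>();
        Task next;
        while ((next = queue.poll()) != null) {
            order.add(next.label);
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(drainOrder());
    }
}
```

With no overrides set, the two index builds come out first (the older one ahead of the newer), then the larger compaction, then the smaller one, and verify last.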

The implementation here was pretty straightforward - we create a new interface to expose the three priority values, and then extend AbstractCompactionTask and de-anonymize the handful of anonymous runnables/wrapped runnables/callables to implement that interface so they can be sorted in the PriorityBlockingQueue.
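
To make the shape of that concrete, here is a minimal sketch of such an interface and a named task class feeding a ThreadPoolExecutor backed by a PriorityBlockingQueue. The names Prioritized, PrioritizedTask and BY_PRIORITY are hypothetical and chosen for this example only; the real patch works against AbstractCompactionTask and the existing compaction executors, which are not reproduced here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PrioritizedExecutorSketch {
    /** Hypothetical interface exposing the three priority values. */
    interface Prioritized {
        int typePriority();
        long subTypePriority();
        long timestamp();
    }

    /** A named class in place of an anonymous Runnable, so the queue can compare it. */
    static final class PrioritizedTask implements Runnable, Prioritized {
        private final int type;
        private final long subType;
        private final long timestamp;
        private final Runnable body;

        PrioritizedTask(int type, long subType, long timestamp, Runnable body) {
            this.type = type;
            this.subType = subType;
            this.timestamp = timestamp;
            this.body = body;
        }

        @Override public int typePriority() { return type; }
        @Override public long subTypePriority() { return subType; }
        @Override public long timestamp() { return timestamp; }
        @Override public void run() { body.run(); }
    }

    /** Higher type first, then higher subtype, then older timestamp. */
    static final Comparator<Runnable> BY_PRIORITY = (x, y) -> {
        Prioritized a = (Prioritized) x;
        Prioritized b = (Prioritized) y;
        int c = Integer.compare(b.typePriority(), a.typePriority());
        if (c == 0) c = Long.compare(b.subTypePriority(), a.subTypePriority());
        if (c == 0) c = Long.compare(a.timestamp(), b.timestamp());
        return c;
    };

    static List<String> runDemo() {
        List<String> ran = Collections.synchronizedList(new ArrayList<>());
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new PriorityBlockingQueue<Runnable>(11, BY_PRIORITY));
        CountDownLatch gate = new CountDownLatch(1);

        // The first task goes straight to the single worker and blocks it,
        // so everything handed over afterwards has to wait in the queue.
        executor.execute(new PrioritizedTask(0, 0L, 0L, () -> {
            try {
                gate.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }));

        // execute() rather than submit(): submit() wraps the task in a FutureTask,
        // which does not implement Prioritized and would fail the cast above.
        executor.execute(new PrioritizedTask(100, 0L, 1L, () -> ran.add("verify")));
        executor.execute(new PrioritizedTask(500, 10L, 2L, () -> ran.add("small compaction")));
        executor.execute(new PrioritizedTask(500, 1_000L, 3L, () -> ran.add("large compaction")));
        executor.execute(new PrioritizedTask(800, 0L, 4L, () -> ran.add("index build")));

        gate.countDown(); // release the worker; queued tasks now run in priority order
        executor.shutdown();
        try {
            executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(ran);
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

The comment about submit() is the reason wrapped runnables and callables need attention at all: anything that reaches the queue has to be comparable, so a wrapper that hides the priority values breaks the ordering.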

There may be an opportunity to get clever and protect against starvation in under-resourced systems, such as increasing type priority over time as tasks age, but I'm leaving that as a potential optimization for the future - I'm not sure it's really needed, and it makes reasoning about compaction harder, but maybe there exists a use case where it's necessary.
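
For anyone curious what aging might look like, the following is a purely hypothetical sketch and not part of the planned patch. It assumes a linear rule in which a task gains one priority point per minute spent waiting; the constant MILLIS_PER_POINT and the method effectivePriority are invented for the example.

```java
class AgingSketch {
    // Hypothetical tuning knob: one extra priority point per minute spent waiting.
    static final long MILLIS_PER_POINT = 60_000L;

    /** Base type priority plus a bonus that grows linearly with time spent queued. */
    static long effectivePriority(int basePriority, long createdAtMillis, long nowMillis) {
        long waitedMillis = Math.max(0L, nowMillis - createdAtMillis);
        return basePriority + waitedMillis / MILLIS_PER_POINT;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long tenHoursAgo = now - 10L * 60 * 60 * 1000;
        // A low-priority task (base 100) that has waited ten hours...
        System.out.println(effectivePriority(100, tenHoursAgo, now));
        // ...compared with a freshly created mid-priority task (base 500).
        System.out.println(effectivePriority(500, now, now));
    }
}
```

Note that a queue which orders its elements at insertion time will not pick up a changing priority on its own, so a real version would presumably need to re-insert or re-sort waiting tasks, which is part of why this makes compaction harder to reason about.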

Expecting to submit the patch early this week - if either of you (Sankalp / Marcus) finds this approach conflicts with your expectations, or if you want to volunteer to review, let me know.

> Prioritize Secondary Index rebuild
> ----------------------------------
>
>                 Key: CASSANDRA-11218
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11218
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: sankalp kohli
>            Assignee: Jeff Jirsa
>            Priority: Minor
>
> We have seen secondary index rebuilds get stuck behind other compactions during bootstrap and other operations. This causes things to not finish. We should prioritize index rebuilds via a separate thread pool or a priority queue.


