Posted to issues@kudu.apache.org by "Yingchun Lai (JIRA)" <ji...@apache.org> on 2019/05/21 10:55:01 UTC

[jira] [Comment Edited] (KUDU-2824) Make some tables in high priority in MM compaction

    [ https://issues.apache.org/jira/browse/KUDU-2824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844721#comment-16844721 ] 

Yingchun Lai edited comment on KUDU-2824 at 5/21/19 10:54 AM:
--------------------------------------------------------------

[https://gerrit.cloudera.org/c/12852/] 

This patch allows administrators to assign different priorities to tables via `gflags`, so that the maintenance ops of high-priority tables have a greater chance of being launched.
 # Define several integer priority levels for tables, e.g. [-5, 5]; the default level is 0. The bounds can be specified by `--max_priority_range`.
 # Each level applies a multiplier to the op scores. Level N's multiplier is `base_multiplier^N`, where base_multiplier is something like 1.1 and can be configured by `--maintenance_op_multiplier`.
 # Priority multipliers are applied only to compaction ops, to improve performance. They are NOT applied to GC ops and flush ops, which work as before.
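
As a rough illustration of the scoring rule in the list above, here is a minimal Python sketch. It is not the patch's code: the names `OpType` and `adjusted_score` are hypothetical, the bounds are assumed to be symmetric ([-N, N] for a range of N), and clamping out-of-range levels is an assumption made only to keep the example self-contained.

```python
from enum import Enum


class OpType(Enum):
    """Hypothetical op categories, used only for this illustration."""
    COMPACTION = "compaction"
    FLUSH = "flush"
    GC = "gc"


def adjusted_score(raw_score, op_type, priority,
                   base_multiplier=1.1, max_priority_range=5):
    """Scale a compaction op's score by base_multiplier ** priority.

    Flush and GC ops are returned unchanged, as described in the comment.
    Clamping the level to [-max_priority_range, max_priority_range] is an
    assumption of this sketch, not something the comment specifies.
    """
    if op_type is not OpType.COMPACTION:
        return raw_score
    level = max(-max_priority_range, min(max_priority_range, priority))
    return raw_score * base_multiplier ** level
```

With a base of 1.1, level 5 scales a compaction score by about 1.61 and level -5 by about 0.62, so two tables with similar raw scores can differ by roughly 2.6x after adjustment.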

Currently, priority levels (prioritizing or deprioritizing) can only be assigned to tables via gflags, e.g. `--maintenance_manager_table_priorities=table_id_1:-5;table_id_2:-1;table_id_3:0;table_id_4:3;table_id_5:4`
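
To make the flag format concrete, the following Python sketch parses a value shaped like the example above (`table_id:level` pairs separated by `;`). The function name `parse_table_priorities` and its error handling are hypothetical; the actual patch may parse and validate the flag differently.

```python
def parse_table_priorities(flag_value):
    """Parse 'id_a:-5;id_b:3' into {'id_a': -5, 'id_b': 3}.

    Empty entries (e.g. from a trailing ';') are skipped. Entries without
    a ':' or without a table id raise ValueError. Both behaviours are
    assumptions of this sketch.
    """
    priorities = {}
    for entry in flag_value.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        # Split on the last ':' so the level is whatever follows it.
        table_id, sep, level = entry.rpartition(":")
        if not sep or not table_id:
            raise ValueError(f"malformed entry: {entry!r}")
        priorities[table_id] = int(level)
    return priorities
```

Tables that do not appear in the parsed mapping would fall back to the default level 0 mentioned above.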

We can improve this once the table 'extra config' functionality is available.


> Make some tables in high priority in MM compaction
> --------------------------------------------------
>
>                 Key: KUDU-2824
>                 URL: https://issues.apache.org/jira/browse/KUDU-2824
>             Project: Kudu
>          Issue Type: Improvement
>          Components: tserver
>    Affects Versions: 1.9.0
>            Reporter: Yingchun Lai
>            Assignee: Yingchun Lai
>            Priority: Minor
>              Labels: MM, compaction, maintenance, priority
>
> In a Kudu cluster with thousands of tables, it's hard for a specific tablet's maintenance ops to be launched when their scores are not the highest, even if the table the tablet belongs to is high priority for Kudu users.
> For example, table A has 10 tablets and a total size of 1G, while table B has 1000 tablets and a total size of 100G. Both receive similar update writes, i.e. their DRSs have similar overlaps and similar redo/undo logs, so they have similar compaction scores. However, table A serves far more reads than table B. Because tables A and B are treated equally in MM, their DRS compactions are launched equally, and we have to wait a long time, until most of the tablets in the cluster have been compacted, before scans on table A become fast.
> So, maybe we can introduce some algorithm to detect high priority tables and speed up compaction of these tables?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)