Posted to issues@kudu.apache.org by "Will Berkeley (JIRA)" <ji...@apache.org> on 2018/10/18 06:45:00 UTC
[jira] [Commented] (KUDU-1400) Improve rowset compaction policy to
consider merging small DRSs
[ https://issues.apache.org/jira/browse/KUDU-1400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16654712#comment-16654712 ]
Will Berkeley commented on KUDU-1400:
-------------------------------------
Todd implemented a configurable flushing time threshold ({{--flush_threshold_secs}}) in 8d026474be, a long time ago.
I've written a [design doc|https://docs.google.com/document/d/1yTfxt0_2p5EfIjCnjJCt3o-nB9xk-Kl2O8yKTA1LQrQ/edit#heading=h.5z0d0yyd9zfk] for improvements to compaction policy that should also help with this issue.
> Improve rowset compaction policy to consider merging small DRSs
> ---------------------------------------------------------------
>
> Key: KUDU-1400
> URL: https://issues.apache.org/jira/browse/KUDU-1400
> Project: Kudu
> Issue Type: Improvement
> Reporter: Binglin Chang
> Assignee: Will Berkeley
> Priority: Major
>
> We see some small tables with a light write load generate lots of small DRSs (~1MB). Since those DRSs do not overlap much, they never get the chance to be compacted, which leaves a lot of very small files/blocks. So:
> # The compaction solution value should account for the benefit of merging small DRSs
> # Flushing the MRS (small or large) every 2 minutes seems suboptimal; maybe flushing a small MRS should have "lower priority" than a rowset compaction with a higher solution value?
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)