Posted to issues@kudu.apache.org by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org> on 2016/04/07 18:07:25 UTC

[jira] [Commented] (KUDU-1400) Improve rowset compaction policy to consider merging small DRSs

    [ https://issues.apache.org/jira/browse/KUDU-1400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15230475#comment-15230475 ] 

Jean-Daniel Cryans commented on KUDU-1400:
------------------------------------------

bq. Compaction solution value should consider benefits of merging small DRS

Is this only a problem for you because you're using the file block manager? The log block manager (LBM) wouldn't be affected as much, right? I guess scans may also be slower, but then again you said it's a small table.

bq. Every 2 min flushing MRS(small or large) seems suboptimal, maybe flushing small MRS should have "lower priority" than rowset compaction with higher solution value?

You're talking about the situation where there's nothing urgent that needs doing and we just rely on the perf improvement, right?

That part can definitely use some more tuning. The problem with lowering the MRS flush priority is that you don't want to hold on to WALs for too long. Maybe 2 minutes is too short, but even if it were 5, would that help you, or would compacting the small DRSs still have a lower priority?
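
To make that question concrete, here is a minimal toy model. It is not Kudu's actual maintenance scheduling code: the names (`Op`, `flush_score`, `pick_op`, `simulate`), the shape of the time-based flush score, and the example compaction scores are all assumptions made for illustration. It only shows that a longer flush threshold opens a window for a compaction with a small positive score, and that a compaction scoring zero (as barely overlapping DRSs might) is never picked either way.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Op:
    name: str
    perf_improvement: float  # hypothetical score; higher is scheduled first


def flush_score(elapsed_secs: float, threshold_secs: float) -> float:
    """Assumed time-based MRS flush score: zero before the threshold,
    then growing with the time elapsed since the last flush."""
    if elapsed_secs < threshold_secs:
        return 0.0
    return max(1.0, elapsed_secs / threshold_secs)


def pick_op(ops: List[Op]) -> Optional[Op]:
    """Pick the op with the highest positive score, or None if nothing scores."""
    candidates = [op for op in ops if op.perf_improvement > 0]
    return max(candidates, key=lambda op: op.perf_improvement, default=None)


def simulate(threshold_secs: float, compaction_score: float,
             elapsed_secs: float) -> Optional[str]:
    """Return the name of the op chosen when nothing urgent is pending."""
    ops = [
        Op("flush_mrs", flush_score(elapsed_secs, threshold_secs)),
        Op("compact_small_drs", compaction_score),
    ]
    chosen = pick_op(ops)
    return chosen.name if chosen else None


if __name__ == "__main__":
    # 130 seconds after the last flush, small positive compaction score.
    print(simulate(120, 0.01, 130))  # 2 minute threshold
    print(simulate(300, 0.01, 130))  # 5 minute threshold
    # Compaction score of zero: the threshold makes no difference.
    print(simulate(300, 0.0, 130))
```

In this model, raising the threshold only helps if the small-DRS compaction gets a non-zero score in the first place, which is why the two points in the description are linked.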

> Improve rowset compaction policy to consider merging small DRSs
> ---------------------------------------------------------------
>
>                 Key: KUDU-1400
>                 URL: https://issues.apache.org/jira/browse/KUDU-1400
>             Project: Kudu
>          Issue Type: Improvement
>            Reporter: Binglin Chang
>
> We see some small tables with a light write load generate lots of small DRSs (~1MB). Since those DRSs do not overlap much, they don't get the chance to be compacted, which leaves a lot of very small files/blocks. So:
> # Compaction solution value should consider benefits of merging small DRS
> # Every 2 min flushing MRS(small or large) seems suboptimal, maybe flushing small MRS should have "lower priority" than rowset compaction with higher solution value?


